1 Introduction

Flooding has caused tremendous economic losses and fatalities around the world (Najibi and Devineni 2018), and coastal cities are at particularly high risk (Aerts et al. 2014). The population of coastal cities worldwide has increased 4.5 times over the last 70 years, and coastal urbanization will likely continue in the coming decades (Barragán and de Andrés 2015). Urban areas are expanding to accommodate the large influx of people into cities (Dang et al. 2020). In recent decades there has been a massive migration toward the coastal regions of China (Niu and Zhao 2018), which has concentrated wealth and population in these areas (Dovern et al. 2014). This change increases coastal cities’ exposure to risks (Surjan et al. 2016), especially flood risk. The flood threat to coastal cities is worsening under climate change, which introduces a high degree of uncertainty in future flood risk (Deng et al. 2016; Lin et al. 2016; Gu et al. 2019; Fang et al. 2020). Flooding in its diverse forms ultimately results in economic losses and human casualties. Therefore, accurate assessment of flood inundation in coastal urban areas is vital for understanding flood risks and for providing precise disaster forecasting and emergency response (Yang et al. 2020; Yin et al. 2020).

To better understand flood risk, flood inundation mapping is critically important for identifying potentially affected areas and estimating inundation depths, and thus for evaluating the severity of flood hazards. Owing to improvements in computing performance and the simplification and development of algorithms, hydrodynamic models have been widely used for flood simulation (Lesser et al. 2004; Yu and Lane 2006). As the primary topographic input, Digital Elevation Models (DEMs) have proven to be a vital element controlling hydrodynamic model accuracy (Kenward et al. 2000; Cobby et al. 2001). Open access DEM products have been widely used in flood simulation and mapping (Pedrozo-Acuña et al. 2015). However, the relatively poor resolution and accuracy of current open access DEMs significantly limit the ability to estimate inundation areas and the associated risks (Sampson et al. 2016). It has been demonstrated that low DEM data quality can lead to severe flood prediction biases (Hawker, Bates, et al. 2018), which stem mainly from the spatial resolution and vertical error of DEMs. Low spatial resolution degrades the delineation of surface features and thus the accuracy of the flood simulation (Vaze et al. 2010; Saksena and Merwade 2015). Vertical elevation errors likewise affect the accuracy of the represented terrain and hence of the flood simulation (Mukherjee et al. 2013; Talchabhadel et al. 2021). It is therefore widely recognized that accurate DEMs are critical for high-precision flood modeling and management (Cook and Merwade 2009; Coveney and Fotheringham 2011).

A substantial body of research has addressed improving DEMs for flood modeling. DEM noise correction and systematic bias correction are the two most common approaches. In high-precision flood modeling, errors in the terrain attributes represented by DEMs affect the simulation of channel morphology and bathymetry in fluvial floods. Coarse DEM resolution, signal attenuation over water bodies, and spurious radar reflections make channel morphology difficult to determine, and these issues cannot yet be sufficiently addressed.

Measurement errors appear as noise in the elevation data, including stripe noise and random speckle noise. Stripe noise affects extensive areas but can be readily identified and eliminated, for example by using two-dimensional Fourier filtering to detect unrealistic terrain undulations and remove the stripes (Yamazaki et al. 2017). Speckle noise, in contrast, occurs randomly and at uncertain locations. Some studies have attempted to improve DEMs through systematic editing, such as eliminating spurious pits or sinks (Hutchinson 1989; Soille et al. 2003). Noise can also be reduced by filtering, for example by applying an adaptive-scale smoothing filter to remove random speckle noise (Gallant 2011). Nonetheless, such a revised DEM does not necessarily perform well in every study area, as different DEMs behave differently in different regions (Wong et al. 2014; Elkhrachy 2018). Other methods correct errors in an original product by deriving an entirely new DEM, but even high-precision global DEMs can still contain significant errors at local scales (Holmes et al. 2000). Some current studies obtain more accurate elevation data through field investigation, such as LiDAR surveys with higher terrain accuracy or ground control point interpolation with the help of unmanned aerial vehicles (Karamuz et al. 2020; Kim et al. 2020). However, the high economic cost and the time required to acquire such data make these methods difficult to apply to large areas in flood simulation (Aguilar et al. 2010).

Biases or artifacts in DEMs are systematic errors introduced by the procedures used in DEM generation (Wechsler 2007). Some studies have focused on the vertical differences between different DEMs (Sanders 2007; Bhuyian and Kalyanapu 2018). However, most studies do not spatially quantify how the uncertainty reflected in the vertical error affects the results (Gesch 2018). Other studies identified better-performing DEMs by comparing data quality within one or more study areas (Du et al. 2016; Zhang et al. 2019), but this does not mean that those DEMs perform equally well in other areas of interest. A further category of studies relies on global error metrics, such as determining upper and lower bounds on elevation error (Kyriakidis et al. 1999); although the error in the DEM is not reduced, its bounds are established. Some studies conducted error analysis probabilistically, using tools such as the Monte Carlo technique (Wechsler and Kroll 2006) and sequential Gaussian simulation (Fereshtehpour and Karamouz 2018). Such approaches do not produce a single precise result but rather a map of the spatial distribution of possible errors, indicating the likelihood of any location falling above or below a specified elevation. A modified deterministic and probabilistic treatment of vertical uncertainty is considered a better option than a simple deterministic approach that ignores the effects of elevation errors (Gesch 2018).

In this study, we generated a new DEM from several existing DEM datasets by considering the interconnections and differences between current open access DEMs. We examined whether existing datasets can be combined and corrected to eliminate errors and, based on this idea, developed a statistically based method and applied it to the DEMs. The result is an entirely new DEM that nevertheless takes full advantage of the raster information in every original DEM. The main benefits of this method are its convenience and the fact that missing or incorrect portions of an individual DEM do not limit the result generated for the study area. The generated DEM can also be continuously updated as the number of DEM products increases and should, in principle, approach the true elevation values ever more closely. This study investigated the feasibility of the new statistically based method for eliminating DEM error by: (1) simulating and comparing the flooding results from the currently available open access DEMs; (2) analyzing and evaluating the performance of the newly generated DEM versus the original DEMs in flood simulation; and (3) discussing the error factors present in typical DEMs in the study area.

2 Materials and Methods

The following subsections present a brief introduction of the study area and the employed datasets and their properties, followed by the methods.

2.1 The Shanghai Study Area

Shanghai is located on the alluvial plain of the Yangtze River Delta in China. It has a dense population of more than 24 million and produced over USD 550 billion (RMB 3.8 trillion) of gross domestic product (GDP) in 2019, 3.8% of China’s national GDP (SMSB 2020). It is a low-lying area with an average elevation of 4 m, whereas the estuary system’s average tidal amplitude can reach up to 4.6 m (Wang et al. 2012). Owing to the low-lying terrain and the high tidal amplitude, Shanghai experiences frequent floods of various types and is one of the cities most vulnerable to flooding in the world (Balica et al. 2012). More than 1,800 flood-related casualties were reported from 1949 to 2005 (Wen and Xu 2006). These devastating flood disasters were associated with various meteorological hazards, such as typhoons, heavy rain, prolonged precipitation, and riverine flooding. Typhoon Winnie in 1997 caused the Huangpu River to reach an extreme water level of 5.99 m (Du et al. 2015), with an inundation area of 495 km2 in the city center, seven deaths, and direct economic losses of USD 80 million (RMB 670 million). Although an event like Typhoon Winnie is extremely rare, Shanghai remains likely to suffer severe flooding in the future due to global sea level rise and land subsidence.

The Huangpu River is a typical lowland river in Shanghai; it originates from Dianshan Lake—one of the lakes in the Taihu Lake Basin—and flows into the Yangtze River Estuary. It is the largest river in the Taihu Lake Basin, carrying 70% of Taihu Lake’s water flow (Yin et al. 2013). The main stem of the Huangpu River extends from the Mishidu gauge station in the upper reaches to the Wusongkou gauge station in the lower reaches, with a total length of about 75 km, running through the city center of Shanghai (Fig. 1). The main stem of the Huangpu River was selected as the study area to illustrate the impact of different DEMs on inundation simulation. The study area constitutes the central part of Shanghai Municipality, covering over 3,000 km2 and home to more than 13 million people.

Fig. 1 Location of the study area—the Huangpu River Basin—in Shanghai, China

2.2 Digital Elevation Models

We considered seven freely accessible global DEM datasets: SRTM, MERIT, CoastalDEM, GDEM, AW3D30, NASADEM, and TanDEM-X. Because TanDEM-X has substantial missing data over the study area, the first six datasets were used in this study (Fig. 2).

Fig. 2 Six Digital Elevation Model (DEM) datasets used for the study in Shanghai, China (Inserted table: Basic information of the six DEMs used in this study)

SRTM: The first version of the Shuttle Radar Topography Mission (SRTM) products was released in 2003 at 30 m and 90 m horizontal resolution. The published datasets were processed with an interpolation algorithm to fill data holes in SRTM (Reuter et al. 2007). The absolute elevation error of SRTM at the 90% quantile (LE90) ranges from 5.6 m to 9.0 m (Rodríguez et al. 2006). Both the 30 m and 90 m resolution DEMs were adopted in this study, with the 90 m data used for error correction.

MERIT: The Multi-Error-Removed Improved-Terrain DEM (MERIT) was developed by removing various error components, including absolute bias, stripe noise, speckle noise, and tree height bias, from existing DEMs (SRTM, AW3D, Viewfinder Panoramas DEM). Data are provided for all regions except Antarctica. The absolute elevation error at the 90% quantile is 5 m (Yamazaki et al. 2017). We used MERIT version 1.0.3 at 90 m resolution in this study.

CoastalDEM: The CoastalDEM dataset was developed exclusively for coastal areas and is based on SRTM data. Using machine learning techniques, the accuracy of coastal terrain in this dataset was effectively improved, cutting the root mean square error (RMSE) by about 50% compared with SRTM (Kulp and Strauss 2018). Based on this dataset, more coastal areas are found to be exposed to floods (Kulp and Strauss 2019). The dataset has 30 m and 90 m versions; only the 90 m horizontal resolution version is freely accessible for non-commercial use, and it was adopted in this study.

GDEM: ASTER GDEM is a global one-arc-second elevation dataset derived from the Terra Earth observation satellite and released by METI and NASA. It has the advantage of broad coverage, spanning most of the Earth’s land surface between 83°N and 83°S. The RMSE of ASTER elevations was estimated to be 8.68 m (Tachikawa, Hato, et al. 2011; Tachikawa, Kaku, et al. 2011). GDEM version 2, released in October 2011, uses 260,000 additional stereo pairs and improves spatial resolution as well as horizontal and vertical accuracy (Tachikawa, Hato, et al. 2011); this version was adopted in this study.

NASADEM: NASADEM reprocesses SRTM data and merges it with ASTER GDEM elevations to improve height accuracy. The improved SRTM heights in NASADEM come from better vertical control by reference to the Ice, Cloud, and Land Elevation Satellite (ICESat), and gaps in SRTM are reduced by using an improved interferometric unwrapping algorithm (Crippen et al. 2016). NASADEM covers 80% of Earth’s land area, including land between 60°N and 56°S. The version used in this study was released in February 2020 at 30 m horizontal resolution.

AW3D30: The ALOS World 3D 30 m data were obtained by the Panchromatic Remote-sensing Instrument for Stereo Mapping (PRISM) on the Advanced Land Observing Satellite (ALOS) (Tadono et al. 2014). Elevation values are derived by software from the positions of the same feature imaged by three cameras with different viewing angles, which effectively improves accuracy. AW3D30 is a 30 m resampled version of AW3D5 for non-commercial use, with global coverage from 83°N to 82°S. The RMSE of AW3D against 5,121 check points distributed across 127 image tiles was 4.40 m (Takaku et al. 2016). The AW3D30 data used in this study are from version 2.3, released in April 2019, at 30 m horizontal resolution.

Four sets of 30 m resolution data (AW3D30, NASADEM, GDEM, SRTM 30 m) and three sets of 90 m resolution data (SRTM 90 m, MERIT, CoastalDEM) were selected for this study. Figure 2 shows the six sets of open access DEMs and their basic information. Among them, MERIT, CoastalDEM, and NASADEM are derived from SRTM data modified by algorithms; the GDEM data source is the ASTER sensor, and AW3D30 comes from the ALOS satellite.

The table in Fig. 2 shows the basic properties of the six datasets in the study area. CoastalDEM has the lowest average elevation and GDEM the highest; the average elevation of GDEM is more than three times that of CoastalDEM. In terms of elevation distribution, for the five DEMs other than CoastalDEM the largest share of the study area lies above 4 m. In contrast, in CoastalDEM only 2.0% of the study area is higher than 4 m, and nearly half of the area (43.8%) lies below 2 m.
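As an illustration of how such summary statistics can be derived, the following sketch computes the mean elevation and the share of cells in each elevation band with rasterio and NumPy. It assumes the clipped study-area DEMs are available as GeoTIFF files; the file names are hypothetical.

```python
# Sketch: per-DEM elevation statistics over the study area.
# Assumes each DEM has been clipped to the study area and saved as a
# GeoTIFF; the file names below are placeholders, not actual data files.
import numpy as np
import rasterio

dem_files = {
    "SRTM30": "srtm30_studyarea.tif",
    "GDEM": "gdem_studyarea.tif",
    "CoastalDEM": "coastaldem_studyarea.tif",
}

for name, path in dem_files.items():
    with rasterio.open(path) as src:
        z = src.read(1, masked=True).astype(float)  # masked array honours nodata
    valid = z.compressed()                           # keep valid cells only
    below_2m = np.mean(valid < 2.0) * 100
    above_4m = np.mean(valid > 4.0) * 100
    print(f"{name}: mean={valid.mean():.2f} m, "
          f"<2 m: {below_2m:.1f}%, >4 m: {above_4m:.1f}%")
```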

2.3 Methodology

The following subsections present the three main processes of the study including Digital Elevation Model processing, simulating flooding events, and assessing inundation simulation accuracy.

2.3.1 Digital Elevation Model Processing

We first extracted the DEMs for the study area, including coordinate transformation and clipping. We then processed the DEMs with two methods. Before processing, the six datasets were resampled to a consistent spatial resolution. Nearest neighbor resampling was used, as it has been found to yield the highest accuracy in DEM resampling (Takagi 1998).
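A minimal sketch of this preprocessing step is given below, assuming rasterio is used and a 30 m product is resampled onto the grid of a 90 m reference raster with nearest neighbor resampling; the file names and the choice of reference grid are illustrative, not the actual workflow files.

```python
# Sketch: resample a 30 m DEM onto the grid of a 90 m reference DEM
# using nearest neighbor resampling (file names are placeholders).
import numpy as np
import rasterio
from rasterio.warp import reproject, Resampling

with rasterio.open("srtm90_studyarea.tif") as ref:
    ref_profile = ref.profile
    dst_shape = (ref.height, ref.width)

with rasterio.open("aw3d30_studyarea.tif") as src:
    resampled = np.full(dst_shape, np.nan, dtype="float32")
    reproject(
        source=rasterio.band(src, 1),
        destination=resampled,
        dst_transform=ref_profile["transform"],
        dst_crs=ref_profile["crs"],
        resampling=Resampling.nearest,  # found most accurate for DEMs (Takagi 1998)
    )

with rasterio.open("aw3d30_resampled_90m.tif", "w",
                   **{**ref_profile, "dtype": "float32"}) as dst:
    dst.write(resampled, 1)
```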

The first method processes the DEMs by directly taking the mean of the elevation values at the same row and column across the six datasets (Mean). The second removes erroneous elevations with the Dixon method before averaging (Dixon criterion).

The revision principle is based on the statistical practice of eliminating outliers from data to reduce error and thus improve accuracy. We extended this idea to DEM processing: the elevation values at the same location (row, column) in the different DEM products are analyzed, values with large errors are eliminated, and values within the normal range relative to all DEM data at that location are retained.

Mean: The original high-resolution data were first resampled to the lower resolution to ensure consistency across datasets. A new elevation value z was then generated by averaging the elevation values of the original DEMs at the same row (x) and column (y) and assigning that value to the same location, and so on for every cell, to generate a new DEM.
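The cell-wise averaging can be expressed compactly once the resampled DEMs are stacked into a single array. The sketch below assumes the six resampled rasters share an identical grid and uses hypothetical file names.

```python
# Sketch: cell-wise mean of six co-registered DEMs (the "Mean" DEM).
import numpy as np
import rasterio

paths = ["srtm90_studyarea.tif", "merit_90m.tif", "coastaldem_90m.tif",
         "gdem_resampled_90m.tif", "nasadem_resampled_90m.tif",
         "aw3d30_resampled_90m.tif"]

layers = []
for p in paths:
    with rasterio.open(p) as src:
        profile = src.profile                          # grids assumed identical
        layers.append(src.read(1, masked=True).filled(np.nan))

stack = np.stack(layers)                 # shape: (n_dems, rows, cols)
mean_dem = np.nanmean(stack, axis=0)     # ignore cells that are nodata in some DEMs

with rasterio.open("mean_dem.tif", "w",
                   **{**profile, "dtype": "float32", "nodata": np.nan}) as dst:
    dst.write(mean_dem.astype("float32"), 1)
```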

Dixon criterion: Compared with direct averaging, this method adds a step that eliminates erroneous values with the Dixon method, which is suitable for checking the consistency of a set of measured values when the amount of data is limited (Dixon 1950). The method screens out erroneous values by using the range ratio as a metric: it estimates the differences between adjacent ordered data and flags excessive differences as anomalous. Dixon’s test detects outliers directly from the range ratio without calculating the arithmetic mean or standard deviation (SD) of the sample, and it therefore performs well with small sample sizes. The method is less suitable when both the maximum and the minimum values are suspicious, or when two suspicious values lie on the same side (both near the maximum or both near the minimum).

In this study, the Dixon criterion thresholds follow the national standard of the People’s Republic of China (SAMR 2008), as shown in Table 1. Since the number of samples (DEMs) is less than 10, the table lists n values only up to 10.

Table 1 Dixon criterion threshold, national standards of the People’s Republic of China
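As an illustration, the sketch below applies a Dixon-type range-ratio test cell by cell to the stacked DEMs before averaging. The critical values used here are the commonly tabulated Dixon Q values for small samples and stand in for the thresholds of the national standard in Table 1; the `stack` variable is assumed to come from the averaging sketch above, and the routine drops at most one suspicious extreme per cell.

```python
# Sketch: cell-wise Dixon-type outlier screening before averaging.
# The critical values below are widely tabulated Dixon Q (r10) values for
# small samples and are placeholders for the Table 1 thresholds of the
# national standard; `stack` is the (n_dems, rows, cols) array built earlier.
import numpy as np

Q_CRIT = {3: 0.970, 4: 0.829, 5: 0.710, 6: 0.625, 7: 0.568,
          8: 0.526, 9: 0.493, 10: 0.466}  # illustrative 95% critical values

def dixon_mean(values):
    """Drop at most one suspicious extreme per cell, then average."""
    v = np.sort(values[~np.isnan(values)])
    n = v.size
    if n < 3 or n not in Q_CRIT or v[-1] == v[0]:
        return np.nanmean(v) if n else np.nan
    span = v[-1] - v[0]
    q_high = (v[-1] - v[-2]) / span   # test the largest value
    q_low = (v[1] - v[0]) / span      # test the smallest value
    if q_high > Q_CRIT[n] and q_high >= q_low:
        v = v[:-1]
    elif q_low > Q_CRIT[n]:
        v = v[1:]
    return v.mean()

# Apply to every cell of the DEM stack (simple loop, not vectorized).
rows, cols = stack.shape[1:]
dixon_dem = np.full((rows, cols), np.nan, dtype="float32")
for i in range(rows):
    for j in range(cols):
        dixon_dem[i, j] = dixon_mean(stack[:, i, j])
```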

2.3.2 Flooding Simulation

We chose the hydrodynamic model LISFLOOD-FP, which is widely used in flood simulation and mapping, to simulate flood inundation (Fewtrell et al. 2008; Zhao et al. 2020). LISFLOOD-FP is a simplified two-dimensional raster-based hydrodynamic inundation model (Bates and De Roo 2000). It can simulate floodplain inundation over complex topography in a computationally efficient manner and couples a one-dimensional channel solution with a two-dimensional floodplain solution. For the one-dimensional channel flow, the model uses the simplified St. Venant equations.

The two-dimensional floodplain simulation uses the continuity and momentum equations on the terrain represented by the DEM. Considering the water balance of adjacent grid cells:

$$\frac{dV}{dt} = Q_{up} + Q_{down} + Q_{left} + Q_{right}$$
(1)
$$Q_{ij} = \frac{A_{ij} R_{ij}^{2/3} S_{ij}^{1/2}}{n}$$
(2)

where V is the water volume in a grid cell; Qup, Qdown, Qleft, and Qright are the flow rates from the upstream, downstream, left, and right cells adjacent to that grid cell; t is time; Qij is the flow between grid cells i and j; Aij and Rij are the cross-sectional area and hydraulic radius at the interface of adjacent cells i and j, respectively; Sij is the water surface slope between i and j; and n is the Manning roughness coefficient.
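To make the storage-cell water balance concrete, the following sketch performs one explicit update of Eqs. 1 and 2 on a small grid. It is a minimal illustration of the scheme, not the LISFLOOD-FP code; the grid size, roughness value, time step, and wide-cell approximation (hydraulic radius taken as the flow depth of the donor cell) are assumptions, and the flow limiters and adaptive time stepping of the real model are omitted.

```python
# Sketch: one explicit storage-cell update of Eqs. 1-2 on a regular grid.
import numpy as np

dx = 90.0          # cell size (m), matching the resampled DEM resolution
n_manning = 0.06   # assumed Manning roughness for an urban floodplain
dt = 1.0           # time step (s); kept small for stability in this sketch

def flux(eta_a, eta_b, dep_a, dep_b):
    """Manning-type flow (m^3/s) from cell a to cell b across one face (Eq. 2)."""
    dh = eta_a - eta_b                       # water surface difference
    slope = np.abs(dh) / dx                  # S_ij
    hflow = np.where(dh >= 0, dep_a, dep_b)  # donor-cell depth as flow depth
    # wide-cell approximation: A_ij = hflow * dx and R_ij ~ hflow
    q = (hflow * dx) * hflow ** (2.0 / 3.0) * np.sqrt(slope) / n_manning
    return np.sign(dh) * q                   # signed toward the lower water surface

def step(dem, depth):
    """One explicit update of the cell water balance (Eq. 1)."""
    eta = dem + depth
    dvol = np.zeros_like(depth)

    # faces between row i and row i+1
    qx = flux(eta[:-1, :], eta[1:, :], depth[:-1, :], depth[1:, :])
    dvol[:-1, :] -= qx * dt
    dvol[1:, :] += qx * dt

    # faces between column j and column j+1
    qy = flux(eta[:, :-1], eta[:, 1:], depth[:, :-1], depth[:, 1:])
    dvol[:, :-1] -= qy * dt
    dvol[:, 1:] += qy * dt

    return np.maximum(depth + dvol / (dx * dx), 0.0)

# Example: a tiny flat DEM with water initially ponded in one corner cell.
dem = np.zeros((4, 4))
depth = np.zeros((4, 4))
depth[0, 0] = 1.0
for _ in range(100):
    depth = step(dem, depth)
```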

To assess the uncertainty in the inundation simulation caused by DEM errors, we carried out a sensitivity analysis in which only the input DEM was changed while all other input conditions were kept the same. Uncertainties such as sea level rise, ground subsidence, and embankments were not taken into consideration. For the design water levels, we referred to the water levels used in related research (Yin et al. 2013). In the inundation simulation, we evaluated the differences between two sets of water level data for the 50-year (50a) and 100-year (100a) return periods and compared the average values of the two scenarios.

2.3.3 Assessment of Inundation Simulation Accuracy

We used historical disaster records and the relevant literature (Yuan 1999) to determine areas prone to flooding along the Huangpu River and assessed the accuracy of the inundation simulation with a binary classification. The binary classification method is applicable in areas with small slope variations and performs well in flood simulation assessment (Stephens et al. 2014; Samela et al. 2017), and is therefore appropriate for this study. The two-dimensional contingency matrix comprises four cases, based on a comparison of the simulated and observed results in terms of dry and wet indicators, as shown in Table 2.

Table 2 Contingency table of the confusion matrix with true positive (TP), false negative (FN), false positive (FP), and true negative (TN) values

Once the observed and simulated areas are determined, the accuracy of the inundation simulation is evaluated by calculating the F1 score. The F1 score is the harmonic mean of precision and recall; a higher F1 value indicates better accuracy:

$$\frac{1}{\text{F1}} = \frac{1}{2}\left( \frac{1}{\text{Precision}} + \frac{1}{\text{Recall}} \right)$$
(3)
$$\text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}}$$
(4)
$$\text{Recall} = \frac{\text{TP}}{\text{TP} + \text{FN}}$$
(5)
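A small sketch of this evaluation is given below, assuming the simulated and observed inundation maps are available as boolean arrays on the same grid; variable names and the wet/dry depth threshold are illustrative.

```python
# Sketch: precision, recall, and F1 score (Eqs. 3-5) from two wet/dry maps.
import numpy as np

def f1_score(simulated_wet, observed_wet):
    """Both inputs are boolean arrays on the same grid (True = wet)."""
    sim = np.asarray(simulated_wet, dtype=bool)
    obs = np.asarray(observed_wet, dtype=bool)
    tp = np.sum(sim & obs)          # wet in both simulation and observation
    fp = np.sum(sim & ~obs)         # simulated wet, observed dry
    fn = np.sum(~sim & obs)         # simulated dry, observed wet
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)  # harmonic mean

# Example usage with an assumed 0.05 m wet threshold:
# print(f1_score(simulated_depth > 0.05, observed_flood_prone_mask))
```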

Figure 3 shows a simple diagram that illustrates how to determine the quality of flooding simulation results by binary classification (Jafarzadegan and Merwade 2017).

Fig. 3 An example for understanding the binary classification terms including the rate of true positive (TP) and false positive (FP), and the F1 score used to validate flood inundation simulation accuracy

3 Results

This section compares the inundation simulation results obtained using the six sets of original DEMs and the two sets of processed DEMs. Considering that the proposed method resamples the DEMs to a coarser spatial resolution, we also compare two DEMs with the same data source but different spatial resolutions and analyze those results.

3.1 Effect of Horizontal Resolution

We used the same data source at two resolutions, the 30 m and 90 m SRTM datasets, to compare flood inundation under the same scenario settings. The results are shown in Fig. 4. Comparing the two datasets, we found that the SD, maximum depth, and inundation area of the flood simulation results increase as the return period increases. The most significant change is in inundation extent: the inundation area changes by 20.7% between the two return periods for the 30 m resolution DEM and by 18.6% for the 90 m resolution DEM. The smallest change is in SD, at 2.9% for the 30 m resolution DEM and 2.6% for the 90 m resolution DEM. This result illustrates that the flood simulation error increases as the inundation level rises with increasing return period. The choice of spatial resolution also leads to differences in the results: the coarser resolution reduces the SD and maximum water depth while increasing the inundation area, although the overall change is not large.
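The summary statistics reported here can be computed directly from the simulated depth rasters; the sketch below shows one possible way to do so, with an assumed 0.05 m wet threshold, the SD taken over inundated cells, and hypothetical variable names.

```python
# Sketch: SD, maximum depth, and inundated area from a simulated depth raster.
import numpy as np

def inundation_stats(depth, cell_size_m, wet_threshold=0.05):
    """Return (SD in m, max depth in m, inundated area in km^2)."""
    depth = np.asarray(depth, dtype=float)
    wet = depth > wet_threshold                      # assumed wet/dry threshold
    sd = float(np.std(depth[wet])) if wet.any() else 0.0
    max_depth = float(depth.max())
    area_km2 = wet.sum() * cell_size_m ** 2 / 1e6
    return sd, max_depth, area_km2

# Example with the 90 m grid used in this study:
# sd, dmax, area = inundation_stats(simulated_depth, cell_size_m=90.0)
```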

Fig. 4 Comparison of the results of flood inundation simulation using two SRTM datasets under two inundation scenarios (50a and 100a return periods) on the Huangpu River in Shanghai, where 4a and 4c represent results using the original 30 m SRTM dataset, and 4b and 4d represent results using the original 90 m SRTM dataset. SD represents the standard deviation (in meters), Max represents the simulated maximum water depth (in meters), and Area refers to the inundation area (km2)

3.2 Cross-comparison of Eight Sets of Digital Elevation Models

The results of the simulations using the eight sets of DEMs under the two flooding scenarios are shown in Fig. 5. The results using the six sets of original data show an increase in the maximum inundation depth, SD, and inundation area with increasing return period, which indicates that the simulation error grows as the return period increases. The results using the DEMs processed by the Mean and Dixon methods follow the same pattern. Comparing the results of the same DEM under different flooding scenarios with the results of different DEMs under the same scenario shows that the choice of DEM affects the flood simulation much more strongly than the choice of scenario, underscoring the importance of selecting a suitable DEM for flood simulation.

Fig. 5 Results of flood inundation simulation using six sets of raw DEM data and two sets of processed DEM data under two inundation scenarios (50a and 100a return periods) on the Huangpu River in Shanghai. SD represents the standard deviation (in meters), Max represents the simulated maximum water depth (in meters), and Area refers to the inundation area (km2)

The maximum inundation depth using CoastalDEM is the largest of the eight datasets for both the 50a and 100a flooding scenarios, whereas GDEM gives the smallest. The maximum inundation depths for the remaining six datasets lie in between and differ little from each other. The two sets of processed DEMs substantially reduce the extreme water depths compared with the original DEMs, with the reduction being more pronounced for the DEM generated by direct averaging. The Dixon criterion also significantly reduces the maximum inundation depth compared with the original data.

The inundated area results for the study area are shown in Fig. 6. The simulated inundation area varies considerably between the different DEMs. The smallest inundation area results from using GDEM, with an average inundation area of less than 2 km2, whereas the largest results from using CoastalDEM, with an average inundation area of more than 1,600 km2. The CoastalDEM results show an inundation area far higher than that of the other datasets, exceeding 50% of the study area for both the 50a and 100a scenarios, while the other datasets remain below 15%. The predicted maximum inundation area is more than 1,200 times the minimum. In addition, the inundation extents from the different DEMs show different sensitivities to the water level settings. The Mean DEM is the most sensitive, with a predicted inundation extent under the 100a scenario 1.7 times that under the 50a scenario, whereas CoastalDEM is less sensitive, at 1.1 times.

Fig. 6 Simulated inundation area of the Shanghai study area under two return periods (50a and 100a)

Comparing the two sets of processed DEMs, the simulated inundation extent after direct averaging of elevations is smaller than that obtained with the Dixon method. This is because the excessive heights of GDEM affect the averaged result, while the difference between the Dixon and Mean results shows the superior performance of the Dixon method in identifying outliers and eliminating errors.

3.3 Assessment of Inundation Simulation Accuracy

Based on the binary approach presented in Sect. 2.3.3, F1 values for using different DEMs under the two return periods are shown in Table 3, where a higher F1 value indicates better overall accuracy.

Table 3 F1 scores for using different Digital Elevation Models (DEMs) under two return periods (50a and 100a) in Shanghai

Comparing the F1 values of the simulations using the six sets of raw DEM data, AW3D30, MERIT, NASADEM, and SRTM perform better as the water level deepens. GDEM remains unchanged, and CoastalDEM performance decreases because its inundation extent expands further, resulting in overprediction. MERIT performs best among the original DEM datasets in determining the inundation area. Although the simulation using CoastalDEM identified most waterlogging-prone regions, the accuracy of its result ranks low because of the excessive inundation extent and large error. GDEM, by contrast, identifies most areas as dry, and its accuracy is the lowest because it misses much of the inundation area.

Comparing the two modification methods, there is a considerable gap between the simulation accuracies of the DEMs processed by them. The F1 values of the direct averaging method are lower than those of all the original datasets except GDEM, indicating that blindly averaging while ignoring elevation errors not only fails to improve the values in anomalous areas but also introduces errors into some accurate elevation values. In contrast, the data corrected by the Dixon criterion achieved the highest F1 values among all groups, reflecting the positive effect of DEM correction.

4 Discussion

Comparing the results of the 30 m and 90 m SRTM datasets, we found that coarser resolution leads to an increase in inundation extent, consistent with previous research (Saksena and Merwade 2015; Lim and Brandt 2019). Coarser resolution reduces the maximum depth of inundation, which may be because anomalously high cells are smoothed into coarser cells by the surrounding lower elevations. It may also cause extreme inundation depths to be missed where relatively small low-lying areas exist. The use of different DEMs for the same scenario has a significant effect on the inundation simulation: the GDEM with the smallest inundation area differs by a factor of over 500 from the CoastalDEM with the largest. This indicates that the choice of DEM input has a significant impact on urban flood simulation and demonstrates the critical importance of selecting a suitable DEM for flood modeling.

The quality of DEMs in flood modeling has also been widely discussed (Schumann and Bates 2018). Previous studies pointed out that openly accessible and widely used DEMs, SRTM for instance, were acquired in the 2000s and contain numerous errors (Hawker, Bates, et al. 2018; Hawker, Rougier, et al. 2018). Although continually updated DEMs, such as MERIT, NASADEM, and CoastalDEM, offer performance improvements, this does not mean that they are entirely accurate. For example, in our study, CoastalDEM misidentifies large areas of high-rise buildings in downtown Shanghai as lying below sea level; as shown in Fig. 7, the area with a high density of high-rise buildings corresponds to the area with the lowest elevations in CoastalDEM, which illustrates the anomaly in CoastalDEM. It may therefore overestimate inundation areas and exposure.

Fig. 7 Comparison of CoastalDEM elevation (left), OpenStreetMap (OSM) buildings data (middle), and kernel density estimation result based on OSM data (right) in Shanghai

SRTM also retains the old coastline, neglecting the ongoing coastal land reclamation of the past two decades. This makes coastal flood simulation more difficult because the DEM does not reflect the real land boundary. Particular caution is therefore needed in flood modeling, especially in areas with significant topographic change, coastal areas, and low-lying flat areas.

This article presents a statistically based approach designed to give flood modelers a fresh way of combining current open access datasets so that they complement each other and yield possible performance improvements. Although multiple DEM correction methods are available, they have been limited to correcting a single DEM. Our approach makes correction of the same region possible on the basis of the various sets of currently available DEMs. It combines the advantages of different datasets, screens out erroneous outliers, and can be used in a plug-and-play manner: only the region of interest needs to be processed before simulation, which costs little time or money.

Although highly accurate terrain data improve the precision of terrain representation, such data are often not suitable for direct use in flood simulation because of limited computational capacity. Halving the grid cell size of the simulation input data results in roughly a 10-fold increase in computational cost (Savage et al. 2016). Thus, even when higher resolution DEMs are available, simulations may have to be run at a coarser resolution. This means that flood modeling at present still requires coarse-scale DEMs, which sacrifices the advantages of highly accurate topographic data and highlights the time and economic cost such data impose on flood modeling. Relatively precise terrain data that remain compatible with current computing performance are therefore a more cost-effective research direction. The idea proposed in this article removes some of the error present in DEMs at the current resolution and obtains some performance improvement. We hope that this idea can help flood modelers improve the quality of terrain data in flood simulations. More effort is needed to understand DEM effects in flood simulation as well as in other Earth system simulations.

5 Conclusion

In this study, we focused on comparing the inundation simulation results of six open access DEMs and proposed a new method to eliminate DEM errors in the study area of Shanghai. The study resulted in three main findings:

1. From the sensitivity analysis, in which the same simulation conditions were set and only the DEM was changed, the flooding simulation results show significant differences, such as the difference between the inundation results of CoastalDEM and GDEM. This implies that more care needs to be taken in DEM selection for flood simulation, and a more appropriate DEM should be chosen in the preparation phase of future flood simulations.

2. Our study also found that even when the same DEM is selected, different spatial resolutions lead to differences in the flooding simulation results. As the spatial resolution of the DEM becomes coarser, the predicted flood inundation area increases while the maximum inundation depth decreases. This implies that selecting a coarser DEM may introduce more error into the inundation results. Nevertheless, the effect of spatial resolution on the inundation results is much smaller than that of the choice of DEM.

3. Although the inundation depth results are difficult to compare because historical disaster records are scarce, the elevation data processed with the Dixon criterion and error elimination predict the inundation areas better than all six sets of original data. This demonstrates the potential of the method and offers a new way of thinking for flood simulations constrained by current DEM limitations. The method reduces topographic errors by cross-checking all obtainable DEM data against each other in a horizontal comparison. At the same time, the idea is operable: researchers can determine the study area and the error elimination method themselves, without waiting for upgraded products, to achieve the error reduction. In future research, we will retain the convenience and flexibility of this idea and further investigate error identification methods, such as machine learning, to achieve more accurate results.