1 Introduction

The strong winds and low pressures of tropical cyclones (TCs) often induce highly damaging storm surges, affecting people and economies over large coastal areas. In 2017, U.S. hurricane damage totals exceeded $265 billion, with Hurricanes Harvey, Irma, and Maria ranking among the five costliest hurricanes in recorded history (NOAA National Centers for Environmental Information 2018). Storm surges are influenced by TC intensity, size, and track and can be amplified by shallow coastal bathymetry or local geometry (Mori et al. 2014). Hence, even relatively weak TCs can induce high storm surges under certain conditions.

Hydrodynamic models are used to simulate storm surges, both for operational applications and risk assessments. These hydrodynamic models use wind speed and mean sea-level pressure (MSLP) as forcing, which is usually derived from general circulation models (GCMs). Until recently, these GCMs had horizontal resolutions of 0.45°–1.8° (ca. 50–200 km at the equator) (Saha et al. 2006; Yukimoto et al. 2006). Currently, all available climate reanalysis products have horizontal resolutions of up to 0.75°, including ERA-Interim (Dee et al. 2011) and NCEP/NCAR Reanalysis 1 (Kalnay et al. 1996). Such resolutions are insufficient to fully resolve TC intensity, size, and track (Murakami and Sugi 2010; Schenkel and Hart 2012; Walsh et al. 2007), and are especially problematic for weak TCs (Hodges et al. 2017; Murakami 2014). At the local scale, many studies have employed parametric models (Haigh et al. 2014; Harper and Holland 1999; Holland 1980; Lin and Chavas 2012; Lin et al. 2010) to obtain high-resolution wind and pressure fields. Such models fit MSLP and 10 m wind speeds (U10) to radial profiles with an exponential decay away from the eye. Limitations of such parametric models include the fact that they do not fully capture asymmetric cyclones (Harper and Holland 1999) and do not include dissipation effects over land (Jakobsen and Madsen 2004). Other studies have used high-resolution (down to 50 m) hindcasts to simulate TC characteristics and surge heights (Bunya et al. 2010; Dietrich et al. 2010). These hindcasts are based on regional downscaling and/or regional climate models, and consequently, hindcasts are not applicable in regions with sparse observational data (Nikulin et al. 2012).

A limited set of GCMs is run at horizontal resolutions of 10–30 km (0.09°–0.27°), a scale at which TCs can be resolved (Bacmeister et al. 2016; Mizuta et al. 2012). With the launch of ERA5 (0.25°) in 2017, reanalysis products are now also available at these horizontal resolutions. One high-resolution GCM is the European Centre for Medium-Range Weather Forecasts (ECMWF) Integrated Forecasting System (IFS). In addition to being an operational weather forecasting model, IFS has been used for producing reanalysis products such as ERA5 and ERA-Interim (Dee et al. 2011; ECMWF 2017c). In ERA-Interim, the average number of TCs per year is simulated well (Strachan et al. 2013), but modeled tracks differ from observed ones, and the intensity and size are underestimated (Murakami and Sugi 2010; Schenkel and Hart 2012). The underestimation of intensity is usually driven by resolution effects and poor physical schemes (Flato et al. 2013; Walsh et al. 2007). Previous updates to the representation of tropical atmospheric conditions in the ECMWF IFS have improved tropospheric wind and convection compared to observations (Fiorino 2008), and the model’s spatial resolution was increased from approximately 0.225° at the equator in 2006 to its current resolution of approximately 0.08° (ECMWF 2017a). These updates have significantly contributed to the improvement of the IFS TC track ensemble forecasts. A recent example of the improved performance is the IFS ensemble forecast for Hurricane Sandy’s track, which predicted Sandy’s landfall up to seven days in advance (Bassill 2014; Magnusson et al. 2014).

Apart from track ensemble forecasts, TC intensity forecasts have also improved in the latest model updates (ECMWF 2017a). Together with the emergence of global hydrodynamic models (Carrère and Lyard 2003; Jagers et al. 2014; Verlaan et al. 2015; Vitousek et al. 2017), this makes it possible to simulate TC surges at local to global scales using direct output from GCMs. Such simulations are already carried out operationally, for example for the Atlantic Ocean using the NHC TC advisories, a parametric wind model, and the SLOSH model (Byrne et al. 2017; Jelesnianski et al. 1984). However, no research has been conducted on the use of ECMWF IFS meteorological forcing for high-resolution storm surge modeling. In addition, despite research focusing on methods to test the sensitivity of simulated storm surges to TC wind fields (Cardone and Cox 2009), few studies have analyzed the effects of the resolution of meteorological forcing on simulated storm surges. Wakelin and Proctor (2002) used three meteorological operational analysis datasets to analyze two storm surge events in the Adriatic Sea and concluded that their model works best using meteorological forcing with the highest spatial and temporal resolution. Recent research by Muis et al. (2016) has demonstrated the implications of using coarse-resolution meteorological forcing for global storm surge modeling. They generated time series of storm surges on a global scale using the six-hourly 0.75° ERA-Interim dataset (Dee et al. 2011; ECMWF 2016) and found that extreme sea levels induced by TCs are underestimated due to the coarse resolution of meteorological datasets. This raises the question: what resolution of meteorological forcing is needed to adequately simulate TC-induced storm surges?

In this paper, we test the suitability of ECMWF IFS as meteorological forcing for high-resolution global storm surge modeling. In addition, we analyze the effect of the horizontal resolution of meteorological forcing on maximum storm surge heights. We explore and discuss possible drivers of maximum surge heights.

2 Methodology

The overall methodology is illustrated in Fig. 1. The U10 and MSLP are derived from the ECMWF IFS and aggregated from their original resolution to T799 resolution (approximately 0.225° at the equator) and to various coarser resolutions between 0.25° and 1.0° (Sect. 2.1). Relevant meteorological parameters for the analysis (maximum U10, minimum MSLP, TC size) are derived by tracking each TC (Sect. 2.2). The U10 and MSLP fields are then used to model the associated surge heights using the Global Tide and Surge Model (GTSM) (Sect. 2.3). Storm surges modeled at T799-resolution forcing are compared with observations (Sect. 2.4). Lastly, the effects of different horizontal resolutions of meteorological forcing on maximum surge heights are explained through TC size and intensity, and coastline complexity and slope (Sect. 2.5).

Fig. 1 Schematic overview of the approach followed in this study. Meteorological forcing is extracted from the ECMWF Integrated Forecasting System (IFS) at the native grid resolution (T799, approximately 0.225°). Comparison to observations is performed for the 10 m wind speed (U10), mean sea-level pressure (MSLP) and maximum surge height (Hs). Land maps to derive coastal complexity are taken from the database of Global Administrative Areas (GADM)

We focus on eight case studies of historical TCs (Fig. 2), one in each TC basin. We only consider landfalling TC events occurring after 5 June 2007 00 UTC, to allow for use of the new 4D-VAR data assimilation scheme, which considerably improved clouds and convection in IFS and thereby tropical troposphere forecasts (ECMWF 2017a). In the November 2007 update, lower tropospheric winds in the tropics were also improved.

Fig. 2 Overview of the eight selected case studies. Colors indicate the TC intensity on the Saffir–Simpson scale

2.1 ECMWF IFS forcing

For each case study, U10 and MSLP data are extracted from the ECMWF IFS (ECMWF 2017a). The ECMWF general circulation model is used for numerical weather predictions and consists of a dynamical, physical, and coupled ocean wave component (Persson and Grazzini 2005). The temporal resolution is 3 h. Because of continuous updates in IFS resolution (ECMWF 2017a), the original model resolution varies between the different TCs. Therefore, we homogenize the horizontal resolution of all cases to T799 resolution. We use the 00 and 12 UTC runs and their +3 h, +6 h, and +9 h forecast runs. Upscaling is achieved by averaging over neighboring grid cells (first-order conservative remapping on all spatial variables in the dataset) (Jones 1999). This process is likely to produce smaller differences in wind and pressure intensities between the different resolutions than re-running the global atmospheric model at a coarser resolution would, because small coherent features are typically poorly resolved in numerical models at coarser resolution, but they can still be included when upscaling from a high-resolution to a lower-resolution grid (Boer and Denis 1997).
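The effect of upscaling by averaging can be illustrated with a minimal sketch. The Python snippet below is not the remapping code used in this study: it assumes a regular latitude-longitude grid whose dimensions are divisible by an integer coarsening factor, and it approximates cell areas by the cosine of latitude, whereas first-order conservative remapping (Jones 1999) handles arbitrary source and target grids. The function name `coarsen_area_weighted` and the toy wind field are hypothetical.

```python
import math

def coarsen_area_weighted(field, lats_deg, factor):
    """Average `field` (one row per latitude) over factor x factor blocks.

    Each fine cell is weighted by cos(latitude) as a proxy for its area.
    This is a simplified stand-in for first-order conservative remapping.
    """
    n_lat, n_lon = len(field), len(field[0])
    if n_lat % factor or n_lon % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    weights = [math.cos(math.radians(lat)) for lat in lats_deg]
    coarse = []
    for i0 in range(0, n_lat, factor):
        row = []
        for j0 in range(0, n_lon, factor):
            weighted_sum = 0.0
            weight_total = 0.0
            for i in range(i0, i0 + factor):
                for j in range(j0, j0 + factor):
                    weighted_sum += weights[i] * field[i][j]
                    weight_total += weights[i]
            row.append(weighted_sum / weight_total)
        coarse.append(row)
    return coarse

# Hypothetical U10 field (m/s) with a single sharp peak near the equator.
u10 = [[10.0, 10.0, 10.0, 10.0],
       [10.0, 40.0, 10.0, 10.0],
       [10.0, 10.0, 10.0, 10.0],
       [10.0, 10.0, 10.0, 10.0]]
lats = [0.3, 0.1, -0.1, -0.3]

u10_coarse = coarsen_area_weighted(u10, lats, factor=2)
peak_fine = max(max(row) for row in u10)
peak_coarse = max(max(row) for row in u10_coarse)
print(peak_fine, round(peak_coarse, 2))
```

In this toy case the 40 m/s peak is smoothed to roughly 17.5 m/s after a single 2 × 2 averaging step, while the block mean is preserved. This is the mechanism by which peak intensities are reduced in the coarser-resolution forcing.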

2.2 TC tracking algorithm

To capture the TC characteristics at every time step and to enable comparison with the IBTrACS dataset (v03r10), we track the eight TCs and their characteristics throughout their lifetimes. We use IBTrACS because it is considered the most complete best-track dataset of global historical TC activity (Knapp et al. 2010). The cyclone’s position as given in IBTrACS is taken as the initial position of the eye in the tracking algorithm. The spatial resolution in IBTrACS is generally listed at 0.1°, whereas the spatial resolution in ECMWF IFS is approximately 0.225° at the equator. Because of this difference, we apply the tracking algorithm from Baatsen et al. (2015) to ensure we are looking at the ‘true’ position of the eye in ECMWF IFS. Following this tracking algorithm, we determine the location with the maximum relative vorticity (a measure of the rotation of the horizontal velocity field) in a 5° × 5° box surrounding the initial position of the eye. If this location corresponds to a lower MSLP than the initial position, the position of the eye is updated. We then set the location with the minimum MSLP in a surrounding 2.5° × 2.5° box as the final position of the eye. Using the tracking algorithm from Schenkel and Hart (2012), we extract the maximum U10 and minimum MSLP at every time step within a 7° radius of the final position of the eye. Using this 7° radius, we ensure that these two TC characteristics are captured inside the domain. Following Chavas et al. (2016), we determine TC size by using the radius of vanishing winds r0, defined as the average distance outside of the eye where U10 < 12 m/s.
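The three steps used to locate the eye can be sketched as follows. This is an illustrative Python outline under stated assumptions, not the code of Baatsen et al. (2015): the fields are nested lists on a regular grid, cyclonic vorticity is taken as positive (Northern Hemisphere convention), longitude wrap-around is ignored, and all function names and the synthetic fields are hypothetical.

```python
import math

def nearest_index(lats, lons, position):
    """Grid indices of the point closest to the (lat, lon) `position`."""
    lat0, lon0 = position
    i = min(range(len(lats)), key=lambda k: abs(lats[k] - lat0))
    j = min(range(len(lons)), key=lambda k: abs(lons[k] - lon0))
    return i, j

def box_indices(lats, lons, center, width_deg):
    """All grid indices inside a width_deg x width_deg box around `center`."""
    lat0, lon0 = center
    half = width_deg / 2.0
    return [(i, j)
            for i, lat in enumerate(lats) if abs(lat - lat0) <= half
            for j, lon in enumerate(lons) if abs(lon - lon0) <= half]

def locate_eye(lats, lons, vorticity, mslp, first_guess):
    """Refine a best-track position to the eye position on the model grid."""
    i0, j0 = nearest_index(lats, lons, first_guess)
    # Step 1: strongest relative vorticity in a 5 x 5 degree box.
    box = box_indices(lats, lons, (lats[i0], lons[j0]), 5.0)
    iv, jv = max(box, key=lambda ij: vorticity[ij[0]][ij[1]])
    # Step 2: move there only if the pressure is lower than at the first guess.
    if mslp[iv][jv] < mslp[i0][j0]:
        i0, j0 = iv, jv
    # Step 3: minimum MSLP in a 2.5 x 2.5 degree box gives the final position.
    box = box_indices(lats, lons, (lats[i0], lons[j0]), 2.5)
    ie, je = min(box, key=lambda ij: mslp[ij[0]][ij[1]])
    return lats[ie], lons[je]

# Synthetic 0.5 degree grid with a vortex centered at (12.5 N, 122.0 E).
lats = [10.0 + 0.5 * k for k in range(11)]
lons = [119.0 + 0.5 * k for k in range(11)]
true_eye = (12.5, 122.0)

def bump(lat, lon):
    d2 = (lat - true_eye[0]) ** 2 + (lon - true_eye[1]) ** 2
    return math.exp(-d2 / 2.0)

vorticity = [[1e-3 * bump(lat, lon) for lon in lons] for lat in lats]
mslp = [[1010.0 - 40.0 * bump(lat, lon) for lon in lons] for lat in lats]

# The first guess plays the role of a best-track position that is slightly off.
eye = locate_eye(lats, lons, vorticity, mslp, first_guess=(12.0, 121.0))
print(eye)
```

The vorticity search allows a larger correction of the first guess than the pressure search alone, and the pressure check in the second step guards against jumping to a vorticity maximum that does not belong to the cyclone.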

2.3 Storm surge modeling

For each TC, storm surges are simulated by forcing GTSM with meteorological data (U10 and MSLP) from the ECMWF IFS. GTSM is a global hydrodynamic model implemented with unstructured grids, based on the Delft3D FM software developed by Deltares (Kernkamp et al. 2011). GTSM has a spherical grid with thinning at high latitudes, with cell size dependent on the bathymetry (also known as Courant grid refinement) (Irazoqui Apecechea et al. 2017). Additional refinement is applied in areas with steep slopes, such as mid-oceanic ridges, to improve the representation of the internal tides. This allows for high computational efficiency with high resolution (finer than 7.5 km, and on average 5 km) near coasts and coarser resolutions (up to 50 km) in the deep ocean. The General Bathymetric Chart of the Oceans (GEBCO) 2014 dataset (https://www.gebco.net/data_and_products/gridded_bathymetry_data/), defined on a 30″ grid, is used for bathymetry. The computational time step is 150 s.

GTSM and the output dataset GTSR (Muis et al. 2016) have been used in many recent studies, including Hiroaki et al. (2017), Irazoqui Apecechea et al. (2017), Muis et al. (2017), Vousdoukas et al. (2018) and Williams et al. (2018).

Muis et al. (2016) used 6-hourly ERA-Interim data (at 0.75°) as meteorological forcing in GTSM to obtain a global reanalysis of storm surges (1979–2014). They validated modeled sea levels against observed sea levels using a global set of 472 tide gauge stations from the University of Hawaii Sea Level Center (available at https://uhslc.soest.hawaii.edu/). A validation of the surge levels shows that 95% of all stations have a root-mean-square error (RMSE) lower than 0.2 m, with the average RMSE being 0.11 m (standard deviation 0.05 m). Extratropical storm surges are modeled relatively well, whereas TC storm surges are substantially underestimated. This is shown by the average correlation coefficient, which is significantly lower in tropical regions (0.77) than in extratropical regions (0.87). This underestimation is driven by the relatively coarse resolution of the meteorological forcing, which is unable to fully capture the strong wind and pressure gradients in the TCs in both space and time.

A storm surge is a rise of the sea level as a result of changes in atmospheric pressure and wind drag on the sea surface. The influence of atmospheric pressure is given by the inverse barometer effect (Ross 1854): every 1 hPa drop in atmospheric pressure is accompanied by a roughly 0.01 m increase in sea-level height. In addition, in shallow water there is an additional wind set up that can be approximated roughly as:

$$g\frac{\partial h}{\partial x} = C_d\frac{U^2}{H}, \tag{1}$$

where g is the gravitational acceleration (m/s²), h the surface level above the reference height (m), x the horizontal distance (m), Cd the drag coefficient (−), U the average wind speed at 10 m perpendicular to the coast (m/s) and H the total water depth (m) (Weenink 1958). Because the surface slope is inversely proportional to the water depth and accumulates over the distance x, it follows from this equation that the largest surges occur in shallow water with a wide coastal shelf.
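To make the two contributions concrete, the short Python sketch below evaluates the inverse barometer rule of thumb and the surface slope from Eq. 1. It is an illustration only: the function names are hypothetical, and the value used for the drag coefficient is an assumption chosen for this example and taken to absorb the ratio of air to water density, which Eq. 1 does not show explicitly.

```python
G = 9.81  # gravitational acceleration (m/s^2)

def inverse_barometer_rise(pressure_drop_hpa):
    """Sea-level rise (m) for a pressure drop (hPa): roughly 0.01 m per hPa."""
    return 0.01 * pressure_drop_hpa

def wind_setup_slope(u10, depth, drag_coefficient):
    """Sea-surface slope dh/dx (-) from Eq. 1 for wind speed u10 (m/s) and depth (m)."""
    return drag_coefficient * u10 ** 2 / (G * depth)

# ASSUMPTION: illustrative effective drag coefficient, including the
# air/water density ratio. It is not a value taken from this study.
CD_EFFECTIVE = 2.5e-6

slope_shallow = wind_setup_slope(30.0, 10.0, CD_EFFECTIVE)   # 10 m deep shelf
slope_deep = wind_setup_slope(30.0, 100.0, CD_EFFECTIVE)     # 100 m deep water
setup_over_shelf = slope_shallow * 100e3                      # 100 km wide shelf

print(inverse_barometer_rise(50.0))      # rise for a 50 hPa pressure drop
print(slope_shallow / slope_deep)        # tenfold steeper slope in shallow water
print(round(setup_over_shelf, 2))
```

With these assumed values, a 30 m/s onshore wind over a 100 km wide shelf of constant 10 m depth gives a setup of roughly 2.3 m, against 0.5 m from a 50 hPa pressure drop. The slope is ten times smaller in 100 m deep water, and doubling the wind speed quadruples it.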

Hourly output data are extracted from the GTSM coastal grid points. For each TC, we consider an area of 15° × 15° around the landfall location and a time period of 3 days on either side of the moment of landfall. In this time period, we then select all coastal points in the T799-resolution forcing at which the maximum surge height is at least 50% of the overall maximum surge height, with a minimum height of 15 cm. This way, only coastal points with high storm surges are included in the statistical analysis. For the other resolutions, the same set of coastal grid points in GTSM is used, to ensure a direct comparison in storm surge heights at coarser resolutions.
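The selection rule for coastal points can be written compactly. The Python sketch below assumes that both conditions must hold, so that the threshold is the larger of 50% of the overall maximum and 0.15 m; this reading, the function name, and the example values are assumptions for illustration.

```python
def select_coastal_points(max_surge_by_point, fraction=0.5, min_height=0.15):
    """Return the ids of coastal points with a high maximum surge.

    max_surge_by_point maps a point id to its maximum surge height (m)
    in the reference (T799-resolution) simulation.
    """
    overall_max = max(max_surge_by_point.values())
    # Both criteria apply, so the stricter of the two sets the threshold.
    threshold = max(fraction * overall_max, min_height)
    return sorted(point for point, height in max_surge_by_point.items()
                  if height >= threshold)

# Hypothetical maximum surge heights (m) for two storms.
strong_storm = {"A": 2.0, "B": 1.1, "C": 0.9, "D": 0.1}
weak_storm = {"E": 0.20, "F": 0.12}

print(select_coastal_points(strong_storm))  # threshold 1.0 m
print(select_coastal_points(weak_storm))    # threshold 0.15 m
```

The same list of point ids is then reused for every coarser-resolution run, so that differences in the statistics reflect the forcing resolution and not a change in the set of locations.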

2.4 Comparison of results

We compare the minimum MSLP and maximum U10 in the T799-resolution forcing against IBTrACS. The MSLP is given as an instantaneous value in both datasets. The U10 is given as a 7.5-min average in the T799-resolution forcing, and the observed U10 is the 10-min average wind speed in 3- or 6-hourly intervals. Since the conversion factor between these two averages is approximately 1 (Harper et al. 2008), we directly compare the two variables throughout this paper.

Before analyzing surge heights at coarser resolutions, we first need to demonstrate that our IFS-GTSM model setup adequately simulates maximum surge heights. To do so, we analyze the performance of the model setup at the T799-resolution forcing by comparing the maximum surge heights modeled with the T799-resolution forcing to observed maximum heights. Because of the dense tide gauge network on the U.S. mainland, it is possible to compare modeled and observed storm surge heights for Irma and Sandy at multiple locations along the coastline. For this, we take tide gauge stations within 250 km of the TC track and subtract the daily maxima of tides from the daily maxima of the observed sea levels to calculate skew surge (NOAA 2017). Since these sea levels are referenced to mean sea level, we correct for mean sea-level trends by removing the monthly mean sea level. We compare the tide gauge measurements to neighboring GTSM coastal grid points. For the other TCs, the observed maxima and any applied corrections are taken from the available literature.
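The skew surge calculation described above can be sketched in a few lines of Python. The snippet assumes hourly series with time given in hours since the start of the record, and a single monthly mean sea level supplied by the caller; the function names and the synthetic tide are hypothetical and do not represent the NOAA processing chain.

```python
import math
from collections import defaultdict

def daily_maxima(times_hours, values):
    """Maximum value per day, with days counted from the start of the record."""
    maxima = defaultdict(lambda: float("-inf"))
    for t, v in zip(times_hours, values):
        day = int(t // 24)
        maxima[day] = max(maxima[day], v)
    return dict(maxima)

def daily_skew_surge(times_hours, observed, predicted_tide, monthly_mean=0.0):
    """Daily maximum observed level minus daily maximum predicted tide (m)."""
    detrended = [level - monthly_mean for level in observed]
    obs_max = daily_maxima(times_hours, detrended)
    tide_max = daily_maxima(times_hours, predicted_tide)
    return {day: obs_max[day] - tide_max[day] for day in sorted(obs_max)}

# Synthetic two-day record: 1 m semidiurnal tide, 0.1 m mean sea-level
# offset, and a 0.5 m surge that is present during the second day only.
hours = list(range(48))
tide = [math.cos(2.0 * math.pi * t / 12.0) for t in hours]
surge = [0.0 if t < 24 else 0.5 for t in hours]
observed = [0.1 + td + s for td, s in zip(tide, surge)]

skew = daily_skew_surge(hours, observed, tide, monthly_mean=0.1)
print({day: round(value, 3) for day, value in skew.items()})
```

Because only the two daily maxima are compared, the result does not depend on the exact timing of the surge peak relative to high tide, which makes the measure robust to small phase errors in the predicted tide.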

2.5 Coastal slope and complexity

Apart from being driven by meteorological factors such as U10 and MSLP, storm surge heights can be further amplified when the surge is interacting with shallow coastal bathymetry and coastal complexity (Mori et al. 2014). For this reason, we will also look at coastal slope and coastal complexity as drivers for changes in storm surge heights between different resolutions.

The coastal slope is derived from GEBCO, and calculated as the average slope between the coastline and the bed level 100 km off the coast, perpendicular to the coastline.

Coastal complexity is assessed by calculating the fractal dimension D of the coastline around the landfall location (Mandelbrot 1967). A fractal dimension is the ratio of change between pattern details and measuring scales, calculated using different length scales to measure the length of the outline of an object, such as a coastline. The values of D lie between 1 and 2 for coastlines, where a high D implies a more complex coastline. To calculate D, we use high-resolution country maps (30 m) from the database of Global Administrative Areas (GADM 2017) and length scales between 1 and 100 km. The algorithm for calculating the coastline complexity is based on Hijmans (2016).
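As an illustration of how D follows from measuring an outline at several length scales, the Python sketch below applies a simplified divider (ruler) method: the measured length L(s) scales as s^(1−D), so D is one minus the slope of log L against log s. This is not the Hijmans (2016) algorithm used in this study. The walk jumps between existing vertices instead of interpolating along segments, coordinates are unitless, and all function names are hypothetical. A Koch curve, for which D = log 4 / log 3 ≈ 1.26 is known, serves as the test outline.

```python
import math

def koch_curve(level):
    """Vertices of a Koch curve between (0, 0) and (1, 0)."""
    points = [(0.0, 0.0), (1.0, 0.0)]
    cos60, sin60 = 0.5, math.sqrt(3.0) / 2.0
    for _ in range(level):
        refined = [points[0]]
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            dx, dy = (x1 - x0) / 3.0, (y1 - y0) / 3.0
            a = (x0 + dx, y0 + dy)
            b = (x0 + 2.0 * dx, y0 + 2.0 * dy)
            # Apex: the middle third rotated by 60 degrees about point a.
            apex = (a[0] + dx * cos60 - dy * sin60,
                    a[1] + dx * sin60 + dy * cos60)
            refined += [a, apex, b, (x1, y1)]
        points = refined
    return points

def divider_length(points, ruler):
    """Outline length measured with steps of at least `ruler`."""
    current = points[0]
    length = 0.0
    for p in points[1:]:
        step = math.hypot(p[0] - current[0], p[1] - current[1])
        # Small tolerance so that rounding does not skip an exact ruler step.
        if step >= ruler * (1.0 - 1e-9):
            length += step
            current = p
    return length

def fractal_dimension(points, rulers):
    """Estimate D from a least-squares fit of log L(s) against log s."""
    xs = [math.log(s) for s in rulers]
    ys = [math.log(divider_length(points, s)) for s in rulers]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return 1.0 - slope

outline = koch_curve(level=5)
rulers = [3.0 ** -k for k in range(1, 5)]
d_koch = fractal_dimension(outline, rulers)
print(round(d_koch, 3))
```

For a straight outline the measured length does not change with the ruler, which gives D = 1. The more the length grows as the ruler shrinks, the closer D moves toward 2.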

3 Results and discussion

3.1 IFS-GTSM model performance at T799-resolution forcing

3.1.1 Comparison of U10 and MSLP

The modeled and observed U10 and MSLP for all TCs are listed in Table 1. Spatial plots of U10 and MSLP at landfall can be found in Supplementary Material Figs. 1–4. We see that the modeled MSLP and U10 intensities are generally underestimated in the T799-resolution forcing as compared to the observed values. The modeled MSLP values are up to 60–70 hPa higher than the observed values (Hurricane Patricia and Typhoon Haiyan). Conversely, Cyclone Gonu has a modeled MSLP that is 15 hPa lower than the observed value. Although in most cases the underestimation of U10 is between 10 and 30 m/s, Patricia’s U10 is underestimated by almost 50 m/s. These intensity underestimations for Patricia and Haiyan are likely related to the failure of the data resolution to fully capture their small eyes. The T799-resolution forcing is known to cause considerable intensity underestimations for relatively small TCs with a small eye (ECMWF 2017b), as was the case for Patricia and Haiyan, which had eyes of 13 and 15 km in diameter, respectively.

Table 1 Overview of modeled (Mod) and observed (Obs) variable values at landfall for the T799-resolution forcing. Observed values are extracted from IBTrACS (Knapp et al. 2010)

The R²-values show that there is good agreement between the observed and modeled values. However, the R²-values for Gonu are low compared to the other TCs. This discrepancy is likely due to the IFS update in November 2007, which significantly improved U10 values in the tropics (ECMWF 2017a).

3.1.2 Comparison of maximum surge heights

For Irma and Sandy, tide gauge records can be used to analyze the modeled maximum surge heights. The results are shown in Fig. 3. The R²-values are 0.86 for Irma and 0.74 for Sandy, demonstrating a good fit between the modeled and observed surge heights. These results show that GTSM is capable of capturing the spatial variability in surge heights in both cases, as is also shown in panels c and d of Fig. 3. However, when zooming in to the local level, we notice some deviations from observed values. Underestimations in modeled storm surges can be caused by various factors. One of these factors is that bays and estuaries are in general not captured by GTSM’s grid resolution (approximately 5 km near the coastline). In addition, uncertainties imposed by the meteorological forcing can also cause lower modeled storm surges. Overestimations in the modeled storm surges may be caused by differences in the locations of the coastal points and the tide gauges, such as a GTSM grid point at the coast versus a tide gauge located in a harbor or a semi-open inlet.

Fig. 3 Upper panels show scatter plots of the modeled and observed maximum storm surge heights for Irma (a) and Sandy (b). Lower panels show the modeled and observed (dots) maximum storm surge heights for Irma (c) and Sandy (d). Observations are taken from NOAA tide gauge stations (14 stations for Irma, 22 stations for Sandy)

Because we compare the tide gauge locations to nearby GTSM coastal grid points, Irma’s maximum modeled surge height of 2.6 m near Everglades City (Table 1) is not included in the scatter plot (Fig. 3a). The nearest tide gauge station was located at Fort Myers, approximately 100 km north of Everglades City, so that a GTSM coastal point closer by was selected.

From Table 1, we see that Sandy’s modeled maximum U10 is more than 50% lower than observed. From the quadratic relation between wind and surge (Eq. 1), we would expect a 75% lower surge, but this is not seen in the simulations (Fig. 3). This is likely due to the fact that resolution effects of the climate model do not only lead to an underestimation of wind intensity and the pressure drop, but can also lead to an overestimation of the storm size (known as numerical diffusion). The relatively large wind field increases the storm surge, which compensates for the TC intensity underestimation.
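As a worked illustration of this expectation, and assuming that water depth, drag coefficient, and shelf width are unchanged so that only the wind speed differs, Eq. 1 gives for the ratio of modeled to observed wind setup:

$$\frac{h_{\mathrm{mod}}}{h_{\mathrm{obs}}} \approx \left(\frac{U_{\mathrm{mod}}}{U_{\mathrm{obs}}}\right)^{2} = (0.5)^{2} = 0.25,$$

that is, a wind-driven surge about 75% lower than observed. The subscripts mod and obs are introduced here for illustration only, and the pressure-driven (inverse barometer) contribution is left out of this estimate.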

For the other case studies, we refer to the maximum surge heights around the landfall location listed in the literature (see Table 1). The differences between the modeled and observed maximum surge heights are smaller than 0.5 m for five TCs: Hurricanes Irma and Sandy and Cyclones Giovanna, Nargis, and Gonu. For Patricia, the modeled maximum surge height is approximately 0.2 m. However, there is no mention of a storm surge in the official tropical cyclone report (Kimberlain et al. 2016), from which we conclude that any possible storm surge would have been low. This conclusion is in line with our model results.

Larger differences between the modeled and observed maximum surge heights are seen for Haiyan and Yasi. In both cases, the maximum surge height was recorded in an inlet. Because of GTSM’s resolution of ~ 5 km near coastlines, these inlets are not (fully) captured by the model. For Haiyan, it is likely that the strong intensity underestimation adds to the underestimation of the storm surge.

Based on this performance, we conclude that the IFS-GTSM model setup with T799-resolution forcing is capable of capturing large-scale spatial patterns of maximum surge height sufficiently well for the analysis of the effect of using lower-resolution meteorological forcing.

3.2 Horizontal resolution effects

Our results confirm that storm surge simulations using coarse-resolution meteorological forcing generally result in lower storm surge heights (Wakelin and Proctor 2002). These reductions are shown in Fig. 4, where the gradual decrease in slope for the different scatter plots shows that the maximum surge heights at GTSM coastal points decrease with decreasing horizontal resolution. Scatterplots for the other TCs can be found in Supplementary Material Fig. 5.

Fig. 4 Scatterplot of maximum surge height at T799-resolution forcing vs. other resolutions for (a) Hurricane Irma (Florida), (b) Cyclone Giovanna (Madagascar), (c) Cyclone Yasi (Australia) and (d) Cyclone Nargis (Myanmar)

The differences in surge heights are illustrated in Fig. 5, which displays surge heights during the storm’s lifetime for Irma, Giovanna, Yasi, and Nargis at T799 (left) and 1.0° (center) resolutions, and their difference (right). Differences in surge heights for the other TCs can be found in Supplementary Material Figs. 6–9. We calculate the relative difference in maximum surge heights at all selected GTSM coastal points (black dots in Fig. 5) to illustrate the resolution effects for the different TCs (see also Supplementary Material, Table A.3). For both Giovanna and Irma, relative differences in the average maximum surge height between T799- and 1.0°-resolution forcing amount to 39%. Patricia, Sandy, Yasi, and Gonu each have relative differences smaller than 20%, and the largest relative difference is found for Nargis with 47%.

Fig. 5 Maximum surge heights at T799- and 1.0°-resolution forcing and difference in maximum surge heights during the storm’s lifetime between T799- and 1.0°-resolution forcing for (a–c) Hurricane Irma (Florida), (d–f) Cyclone Giovanna (Madagascar), (g–i) Cyclone Yasi (Australia) and (j, k) Cyclone Nargis (Myanmar). Black dots represent GTSM coastal points used in the statistical analysis

For the remainder, we focus on the absolute differences in the average maximum surge heights. These absolute differences can directly affect inundation depths and flood risk estimates (De Moel et al. 2012). Comparing the average maximum surge heights for simulations using different resolutions, Table 2 shows that there are four TCs for which the absolute difference is approximately 0.2 m or less for all resolutions: Patricia, Haiyan, Giovanna and Gonu. For Sandy and Yasi, the maximum differences are approximately 0.35 m. The largest differences are found for Irma and Nargis, with maximum surge heights around 1 m lower in the 1.0°-resolution forcing. These results show that six out of the eight TCs can still be modeled relatively well at low resolutions, with maximum storm surge underestimations smaller than 0.5 m, whereas for Irma and Nargis, meteorological forcing resolutions coarser than 0.75° result in storm surge underestimations of around 0.8 m. Underestimations of this magnitude have a considerable effect on impact calculations (De Moel et al. 2012).
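The two summary measures used in this section, the relative and the absolute difference in the average maximum surge height, can be written as a short Python sketch. It assumes that both runs are evaluated at the same coastal points and that the difference is taken between the two averages; the function names and surge values are hypothetical and are not results from this study.

```python
def mean(values):
    return sum(values) / len(values)

def resolution_effect(surge_reference, surge_coarse):
    """Absolute (m) and relative (-) drop in the average maximum surge height."""
    reference = mean(surge_reference)
    absolute = reference - mean(surge_coarse)
    return absolute, absolute / reference

# Hypothetical maximum surge heights (m) at three coastal points.
surge_t799 = [2.0, 1.5, 1.0]
surge_1deg = [1.2, 0.9, 0.6]

absolute, relative = resolution_effect(surge_t799, surge_1deg)
print(round(absolute, 2), round(relative, 2))
```

The same relative reduction corresponds to very different absolute reductions depending on the surge height, which is why the absolute difference is the more relevant measure for inundation depth and flood risk.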

Table 2 Average maximum surge heights at the GTSM coastal points for the different TCs

From Table 2, it follows that the difference in simulated maximum surge heights between different model resolutions is larger for higher storm surges. The height of a storm surge is driven by a combination of factors, which can broadly be classified into TC characteristics and geographical characteristics. The TC characteristics include intensity (measured via maximum U10 and minimum MSLP) and TC size (Irish et al. 2008). In addition, storm surges can be amplified by certain geographical characteristics, most importantly coastal slope and coastal complexity (Mori et al. 2014). We represent coastal complexity here as a fractal dimension D, where higher values of D imply a more complex coastline. Table 3 shows the TC and geographical characteristics for our eight case studies at landfall. We see that intensity (U10 and MSLP) alone cannot explain the maximum storm surge heights. This insight corresponds with the results of Irish et al. (2008), who have shown that TC size also has a large effect on the storm surge (though we use the radius of vanishing winds (Chavas et al. 2016), where they use the radius to maximum winds as a proxy for TC size). In our cases, the effect of TC size is apparent with the storms Sandy and Yasi, which resulted in large storm surges, despite their relatively low maximum wind speed and high pressure. However, TC size alone is not enough to explain storm surge magnitudes, as some small storms (such as Nargis) still result in high storm surges.

Table 3 TC (upper three rows) and geographical (bottom two rows) characteristics at landfall. All values are modeled values

Geographical characteristics also influence storm surges. Storms that make landfall on coasts with a low complexity and steep slopes generally result in low surges (e.g., Giovanna, Gonu, Patricia), while storms that make landfall on complex and shallow coastlines are associated with larger storm surges. Overall, both TC and geographical characteristics influence the size of the storm surge and, correspondingly, the underestimation that occurs when a coarse-resolution meteorological forcing is used.

4 Concluding remarks

In this paper, we have assessed the suitability of the ECMWF IFS as meteorological forcing for high-resolution storm surge modeling with GTSM. For this, we compared the modeled maximum surge heights of Hurricanes Irma (2017) and Sandy (2012) with observations from tide gauge stations. We found R²-values of 0.86 and 0.74 for Irma and Sandy, respectively, demonstrating that maximum surge heights and their spatial distributions are captured sufficiently well in our IFS-GTSM model setup to simulate historical TC storm surge events. For the other case studies, we compared the modeled maximum surge heights to observations and/or estimates from the literature. We found that modeled surge heights are generally lower than observed heights. For most case studies, the difference between the observed and modeled surge heights is less than 0.5 m, from which we conclude that the IFS-GTSM model setup with T799-resolution forcing is capable of capturing the large-scale spatial patterns of the maximum surge heights sufficiently well.

In addition, we analyzed the effects of different horizontal resolutions of meteorological forcing data on the simulated maximum surge heights by upscaling the meteorological forcing of the eight selected TC case studies to various coarser resolutions between 0.25° and 1.0°. We found that simulated TC storm surges are lower using coarser-resolution datasets, with differences between the highest-resolution and 1.0°-resolution forcing ranging between 0.01 m for Patricia and 1.02 m for Irma. Similar conclusions were reached by Appendini et al. (2013), who forced a wave model with three different atmospheric reanalysis datasets to model significant wave heights. Despite differences in the atmospheric models, they show that wave modeling is improved in finer spatial resolution datasets compared to coarser-resolution forcings.

We also observed that the storms with the highest storm surges generate the largest differences in storm surge heights between the different resolutions. Hence, TCs with high storm surges require high-resolution meteorological forcing for accurate storm surge and impact modeling. Apart from the atmospheric forcing, mesh resolution and bathymetry representation in the hydrodynamic model are also critical in storm surge modeling (Kerr et al. 2013), but the effects of these two elements were not explored here. Therefore, our results should be taken with some caution, as they only serve as a way of assessing the atmospheric resolution effects, rather than a way of validating the hydrodynamic model and providing accurate storm surge height estimates in a particular area.

Furthermore, we examined the relationship between storm surge heights and the TC and geographical characteristics known to influence them (Irish et al. 2008; Mori et al. 2014): intensity, TC size, coastal complexity, and coastal slope. It appears that storm surge height is determined by a combination of all these factors. However, in the eight case studies examined in this study, it seems that the geographical characteristics have a larger effect than the TC characteristics: the highest storm surges are found in regions with high coastal complexity and, in general, a small slope. At a local scale, the orientation of the coastline can play a more dominant role in storm surge enhancements: this can be seen for small islands where the surge is positive on one side and negative on the other, both with the same complex coastline.

Despite the limited dataset, there are indications that coastal complexity is an important driver for maximum surge heights and, in turn, the decrease in maximum surge heights in coarser-resolution meteorological forcing datasets. To further test the relationship between coastal complexity and horizontal resolution effects, we propose the use of a hydrodynamic model in which coastal slope and complexity can be (independently) adjusted for the same TC case study. Since coastal topography (e.g., mountainous regions) can also affect wind fields (Raderschall et al. 2008), the coastal complexity should be adjusted simultaneously in the global atmospheric circulation model.

Our findings show that the use of high-resolution meteorological forcing is particularly beneficial for areas prone to high (several meters) TC storm surges, since these high storm surges are reduced most when using coarser-resolution datasets. For TC case studies with surges below 0.5 m, our results suggest that coarser-resolution datasets can be used with limited effects on maximum surge heights.