1 Introduction

Storm surges are abnormal sea level events caused by meteorological conditions. The storm surges that accompany tropical cyclones are particularly destructive, generating water levels of 8 m or more. These have the potential to cause massive loss of life and financial damage and have done so on many occasions in the past. Rappaport (2014) estimated that 49% of the fatalities during an Atlantic tropical cyclone event are due to the accompanying storm surge. Notable storm surge examples include the 1970 Bhola cyclone storm surge, which generated total water levels of around 9 m in the Northern Bay of Bengal and is estimated to have killed upwards of 300,000 people (Murty et al. 1986). Another is Hurricane Katrina (2005), a high-profile storm that caused extensive flooding in the city of New Orleans, killing 1833 people and causing an estimated $108 billion worth of damage (Knabb et al. 2005).

There are many mechanisms involved in the generation of a storm surge (see Harris 1963; Horsburgh 2011). The main ones are listed below.

  • The inverse barometer effect increases sea level in areas of relatively low surface atmospheric pressure. This is a small part of the surge and is most significant in the open ocean.

  • High wind stress at the sea surface drives water up against (or away from) coastal boundaries, resulting in a higher (or lower) coastal sea level.

  • In the open ocean, waves contribute little towards transport of water. However, as they near the coast and break, momentum is transferred to the water column and water is driven shorewards.

  • Wave run-up effects can contribute towards shoreward water transport and cause additional overtopping of coastal defence structures.

  • The Coriolis effect can affect the surge by diverting wind driven currents towards or away from the coast.
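As a rough illustration of the first mechanism in the list above, the inverse barometer response can be estimated from hydrostatic balance, \(\Delta \eta = \Delta P / (\rho g)\). The sketch below is not taken from any of the models discussed in this paper; the function name and the default seawater density of 1025 kg m\(^{-3}\) are assumptions made for illustration.

```python
def inverse_barometer_rise(pressure_drop_hpa, rho=1025.0, g=9.81):
    """Static sea level rise (m) for a surface pressure deficit given in hPa.

    rho (seawater density, kg m^-3) and g (m s^-2) are assumed typical values.
    """
    pressure_drop_pa = pressure_drop_hpa * 100.0  # 1 hPa = 100 Pa
    return pressure_drop_pa / (rho * g)


rise_1hpa = inverse_barometer_rise(1.0)
rise_50hpa = inverse_barometer_rise(50.0)
print(f"{rise_1hpa:.4f} m per hPa, {rise_50hpa:.2f} m for a 50 hPa deficit")
```

Under these assumptions the response is roughly 1 cm per hPa, so even a 50 hPa pressure deficit accounts for only about 0.5 m of a surge that can reach several metres.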

In order to reduce damage, operational forecasting centres must be able to forecast tropical storm surges accurately and in a timely manner (to allow for any necessary precautions and in some cases evacuation). Forecasting models are developed regionally and many are in operation around the world. For example, the National Hurricane Center (NHC) uses the Sea, Lake and Overland Surges from Hurricanes (SLOSH) model (Jelesnianski et al. 1992) for the US, and the Japan Meteorological Agency uses its own model, primarily for the North West Pacific (Higaki et al. 2009).

It is becoming increasingly important for operational centres to have the ability to accurately forecast storm surges as sea level rise will increase the number of times any surge threshold is reached (IPCC 2014). Some studies suggest that the frequency, location and intensity of tropical cyclones may change with climate change, although there is no consensus. For example, Kossin et al. (2014) suggest that these storms may migrate polewards, changing the locations of areas at risk. Some areas such as Bangladesh are particularly at risk from climate change and sea level rise as the area is comprised of large expanses of low lying, densely populated land (Murty et al. 1986).

Recent decades have seen significant improvement in storm surge forecasting. Much work has been done on developing grid schemes on which to perform finite differencing or finite element techniques. Initially, models used regular, Cartesian grids; however, these do not resolve the coastline well. Since the surge is mainly a coastal phenomenon, coupling or nesting finer resolution grids for areas of complex coastal geometry/bathymetry has been studied. Murty et al. (1986) discuss the use of finite differencing methods.

Since the 1970s, finite element methods have also been used to model surges. These allow the use of highly irregular, triangular grids that capture coastal geometry more accurately than regular grids. More information on these grids can be found in Horsburgh (2011) and Gonnert et al. (2001). Such grids are now the preferred way of dealing with complex coastal boundaries (Horsburgh 2011). The ADCIRC storm surge model uses such finite element methods (Westerink et al. 1992). These methods are useful and contain a great deal of detail but are also computationally expensive, which renders them less suitable for operational use.

More recently, finite volume methods have been developed (Dick 1994). This numerical technique turns the usual partial differential conservation equations into discrete algebraic equations over finite volumes. The method has the benefit of being computationally efficient (like finite difference methods), having geometrical flexibility (like finite element methods) and making it easier to comply with conservation laws (e.g. mass, volume, momentum). FVCOM is an example of a model based on finite volume methods that can be used for coastal modelling. The model was developed by Chen et al. (2003) and uses an unstructured, three-dimensional grid.
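The conservation property mentioned above can be illustrated with a minimal one-dimensional sketch. This is not the scheme used by FVCOM; it is a first-order upwind finite volume update on a uniform periodic grid, and the function name and test values are hypothetical. Because every flux that leaves one cell enters its neighbour, the total of the cell averages is unchanged by the update.

```python
def upwind_step(cells, courant):
    """Advance cell averages by one time step for advection at positive speed.

    cells   : list of cell-average values on a uniform, periodic 1D grid
    courant : u * dt / dx, which must lie in [0, 1] for stability
    """
    n = len(cells)
    # Flux through the left face of cell i is carried from cell i-1 (upwind).
    # Index -1 wraps to the last cell, which gives the periodic boundary.
    left_flux = [courant * cells[i - 1] for i in range(n)]
    # Each cell gains what enters on the left and loses what leaves on the right.
    return [cells[i] + left_flux[i] - left_flux[(i + 1) % n] for i in range(n)]


state = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0]
new_state = upwind_step(state, courant=0.5)
print(new_state, sum(state), sum(new_state))
```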

Storm surge models use sea level pressure and 10-m wind fields as forcing boundary conditions (Horsburgh 2011). Operational forecasting models use idealised wind fields generated parametrically, which offer short computation times and dynamical balance. Global atmospheric models are currently unsuitable for real-time forecasting due to the high resolution required to accurately resolve tropical cyclones and long computation times (several hours on the most powerful supercomputers). Input parameters often consist only of a value for central pressure drop (the difference between the surface pressure at the centre of a tropical cyclone and the ambient air pressure) and the radius of maximum winds. These values can be estimated from techniques using satellite data, such as the Dvorak Method (Dvorak 1975), and from global weather models (Horsburgh 2011). Additionally, data from hurricane reconnaissance flights can be used to estimate atmospheric pressure.

One of the earlier parametric wind field models was suggested by Myers and Malkin (1961) and was based on work by Schloemer (1954) (see Sect. 2 for more information on the Myers model). Holland (1980) advanced these models through the introduction of the Holland B parameter, allowing more control over different shapes of velocity profile. However, this model does not realistically represent the entire profile (Willoughby and Rahn 2004), so it was later revised (Holland et al. 2010). Another notable model has been suggested by Willoughby et al. (2006). This model differs from those mentioned previously as it uses a higher number of parameters and also allows for the use of multiple functions in a piecewise fashion.

Remotely sensed data are steadily becoming more readily available and accurate. These data are potentially useful for improving storm surge forecasting (e.g. by using data assimilation techniques). Here, we investigate the effects of using analysis wind fields derived from remotely sensed data to force operational forecast models in place of idealised parametric wind fields. See Sect. 2 for more information on these parametric wind fields.

In this paper, we attempt to answer the following two questions:

  1. When is using analysis wind fields derived from remotely sensed data useful and how can it be used?

  2. Can using actual analysis wind fields make forecasts of maximum surge height more accurate when compared to using parametric wind fields?

To the authors’ knowledge, using observation-derived wind fields to modify the sea surface forcing has not previously been investigated for operational storm surge models.

2 Method

2.1 Model

To demonstrate the effectiveness of using analysis wind fields, we use the SLOSH storm surge model because it is the operational storm surge model used by the NHC to forecast storm surges for the US Gulf and East coasts. Many other models have been used for complex storm surge simulation, for example ADCIRC (Westerink et al. 1992) and POLCOMS (Holt and James 2001); however, the point of this work is to examine how improvement can be made to real-time operational storm surge forecasting.

A brief overview of SLOSH is given here. For detailed information on the inner workings of the model, see Jelesnianski et al. (1992). SLOSH uses a variation of the linear 2D depth-integrated hydrodynamic equations along with the continuity equation [see Eqs. (1)–(3) below].

$$\begin{aligned} \frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} + v \frac{\partial u}{\partial y}&= -g \frac{\partial \eta }{\partial x} - \frac{1}{\rho } \frac{\partial P_A}{\partial x} \nonumber \\&\quad + \frac{1}{\rho D} (\tau _{sx} - \tau _{bx}), \end{aligned}$$
(1)
$$\begin{aligned} \frac{\partial v}{\partial t} + u \frac{\partial v}{\partial x} + v \frac{\partial v}{\partial y}&= -g \frac{\partial \eta }{\partial y} - \frac{1}{\rho } \frac{\partial P_A}{\partial y} \nonumber \\&\quad + \frac{1}{\rho D} (\tau _{sy} - \tau _{by}), \end{aligned}$$
(2)
$$\begin{aligned} \frac{\partial \eta }{\partial t} + \frac{\partial (Du)}{\partial x} + \frac{\partial (Dv)}{\partial y}&= 0, \end{aligned}$$
(3)

where u and v are the components of flow in the x and y directions, t is time, g is gravitational acceleration, \(\eta \) is the level of the free surface, D is the fluid depth, \(\tau _{sx}\) and \(\tau _{sy}\) are the surface wind stresses in the x and y directions, \(\tau _{bx}\) and \(\tau _{by}\) are the corresponding bottom stresses, \(P_A\) is the atmospheric pressure, and \(\rho \) is the fluid density.

An Arakawa-B (Mesinger and Arakawa 1976) finite differencing scheme is used and the ocean surface is modelled on a polar, hyperbolic or elliptic grid, depending on the chosen model domain. This grid allows for the use of finite differencing whilst also increasing resolution in key areas such as near the coast. The grid properties are pre-defined and specific to each basin.

The parametric wind fields are based on those by Myers and Malkin (1961). They are generated using the following three equations:

$$\begin{aligned}&\frac{1}{\rho _a}\frac{\hbox{d}p}{\hbox{d}r} = \frac{k_{s}V^2}{\sin(\theta )} - V\frac{\hbox{d}V}{\hbox{d}r}, \end{aligned}$$
(4)
$$\begin{aligned}&\frac{1}{\rho _a}\frac{\hbox{d}p}{\hbox{d}r}\cos(\theta ) = fV + \frac{V^2}{r}\cos(\theta ) - V^2\frac{\hbox{d}\theta }{\hbox{d}r}\sin(\theta ) + k_nV^2, \end{aligned}$$
(5)
$$\begin{aligned}&V(r) = V_R\frac{2Rr}{R^2+r^2}, \end{aligned}$$
(6)

where p is the atmospheric air pressure, R is the radius of maximum winds, \(V_R\) is the maximum wind speed, V(r) is the wind speed at radius r, \(\theta \) is the inflow angle at a given location, \(\rho _a\) is the density of air at the surface, f is the Coriolis parameter, and \(k_s\) and \(k_n\) are empirical friction coefficients.

Both p and R are more likely to be known than \(V_R\) so these are used to first approximate \(V_R\) using lookup tables. These values can then be used with Eq. (6) to calculate the wind speed profile V(r) and therefore to solve Eqs. (4) and (5) for p and \(\theta \). The discrepancy between the calculated and analysis p values can then be reduced by changing the values of \(V_R\) until the difference is below a specific threshold.
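The radial profile in Eq. (6) is simple enough to evaluate directly. The following sketch only implements Eq. (6); it does not attempt the iterative solution of Eqs. (4) and (5), and the function name and the example values of R and \(V_R\) are hypothetical.

```python
def wind_speed_profile(r, r_max, v_max):
    """Eq. (6): wind speed at radius r, given the radius of maximum winds
    r_max and the maximum wind speed v_max (r and r_max in the same units)."""
    return v_max * (2.0 * r_max * r) / (r_max**2 + r**2)


R_KM = 40.0  # hypothetical radius of maximum winds (km)
V_R = 45.0   # hypothetical maximum wind speed (m/s)

radii = (0.0, 20.0, 40.0, 80.0, 160.0)
profile = {r: wind_speed_profile(r, R_KM, V_R) for r in radii}
for r, v in profile.items():
    print(f"r = {r:6.1f} km  V = {v:5.2f} m/s")
```

The profile is zero at the storm centre, peaks at exactly \(V_R\) when r = R, and decays outside the radius of maximum winds. It takes the same value at r = R/2 and r = 2R, which shows how little freedom this single-function form has to represent asymmetric or broad wind fields.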

2.2 Data

The analysis wind fields used in this paper are taken from the Multi-Platform Tropical Cyclone Surface Wind Analysis (MTCSWA) product developed by the NHC (Knaff et al. 2011). The MTCSWA wind fields are 2D wind datasets (u and v components) which are generated by blending together five different observation datasets, including ASCAT/QuikSCAT scatterometry, 2D flight-level winds estimated from infrared imagery and 2D winds created from Advanced Microwave. These wind datasets comprise 10 m, 1-min averaged winds at a resolution of 0.1° (latitude and longitude). They are available for storms since 2006 at six-hourly intervals.

Figure 1 shows how the MTCSWA analysis wind fields differ from the parametric wind fields used by SLOSH. The analysis wind fields are generally stronger than the parametric wind fields, especially near the centre of the tropical cyclone (around the eye wall). Winds are moving around the centre of the storm in a roughly anticlockwise direction.

Fig. 1 Analysis minus modelled wind speed for (a) Hurricane Ike 12 h before landfall, (b) Hurricane Ike 6 h before landfall, (c) Hurricane Sandy 12 h before landfall, (d) Hurricane Sandy 6 h before landfall, (e) Hurricane Gustav 12 h before landfall, (f) Hurricane Gustav 6 h before landfall. Wind direction is approximately anticlockwise around the centre of the tropical cyclone

For use in SLOSH, the analysis datasets had to be converted to 10-min winds. To do this, proportional adjustments were applied to the data based on the recommendations in Harper et al. (2010). The factor used to adjust the data depended on whether a specific datapoint was located over the ocean (0.93), over land (0.84) or within 20 km of the coast (0.885).
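A minimal sketch of this conversion is given below. The three factors are those quoted above; the function names, the surface labels and the way a point is classified (a land flag plus a distance to the coast, with the 20 km boundary treated as inclusive) are assumptions made for illustration.

```python
# Factors quoted in the text for converting 1-min winds to 10-min winds.
CONVERSION_FACTORS = {"ocean": 0.93, "land": 0.84, "coast": 0.885}


def classify_surface(is_land, distance_to_coast_km):
    """Hypothetical classifier: points within 20 km of the coast count as coastal."""
    if distance_to_coast_km <= 20.0:
        return "coast"
    return "land" if is_land else "ocean"


def to_ten_minute_wind(u_1min, v_1min, surface):
    """Scale both wind components by the factor for the given surface type."""
    factor = CONVERSION_FACTORS[surface]
    return u_1min * factor, v_1min * factor


surface = classify_surface(is_land=False, distance_to_coast_km=150.0)
u10, v10 = to_ten_minute_wind(30.0, -10.0, surface)
print(surface, round(u10, 2), round(v10, 2))
```

Scaling both components by the same factor reduces the wind speed while leaving the wind direction unchanged.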

We ran hindcasts of three notable tropical cyclone storm events: Hurricane Ike (2008), Hurricane Gustav (2008) and Hurricane Sandy (2012). For each storm, best track data from IBTrACS (Knapp et al. 2010) were used for storm track data and input parameters for the generation of parametric wind fields. Each hindcast is run from 24 h before landfall to 12 h after landfall. Figure 3 shows the track and category for each storm during the simulation period. We have used the following two methods to modify the forcing in the storm surge model.

2.3 Method A

For this method, the wind forcing on the model sea surface is changed by directly replacing parametric wind fields with analysis wind fields for a set period of time. Model runs are performed using different time periods to evaluate exactly when using this method might be useful. The time periods run from 24 h before landfall up until 18, 12 and 6 h before landfall (see Table 1). After the time period has ended, the model will once again use parametric wind fields. To maintain stability, a linear interpolation scheme is used to smoothly transition between subsequent analysis wind fields and back to the parametric wind fields. Each element of a wind field is interpolated to its corresponding element (same distance and bearing from storm centre) in the next wind field.
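The transition between successive wind fields can be sketched as follows, assuming that both fields are stored on the same storm-relative grid so that elements at the same index share the same distance and bearing from the storm centre. The function names and example values are hypothetical, and this is not the SLOSH implementation.

```python
def interpolate_fields(field_start, field_end, weight):
    """Blend two storm-relative wind component fields element by element.

    weight = 0 returns field_start, weight = 1 returns field_end.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [
        [(1.0 - weight) * a + weight * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(field_start, field_end)
    ]


def field_at_time(t, t_start, t_end, field_start, field_end):
    """Linearly interpolate in time between fields valid at t_start and t_end."""
    weight = (t - t_start) / (t_end - t_start)
    return interpolate_fields(field_start, field_end, weight)


# Two hypothetical u-component fields (m/s) that are valid 6 h apart.
u_start = [[10.0, 20.0]]
u_end = [[16.0, 14.0]]
u_mid = field_at_time(3.0, 0.0, 6.0, u_start, u_end)
print(u_mid)
```

The same function covers both cases described above: blending from one analysis field to the next, and blending from the last analysis field back to a parametric field.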

The idea of the method is to simulate what is possible in a real-time operational setting, i.e. using available knowledge of near-present and past wind fields to force the model and parametric wind fields (derived from hurricane forecasting methods) where future analysis wind fields are obviously unavailable. This changes the model sea surface state at a specific point in time through changing the wind forcing. In a real-time setting, this point in time would be approximately equivalent to the present. The hope is that any sea surface modifications will influence the future model sea surface state and generate a surge forecast with increased accuracy.

Table 1 Model run designations and what they refer to

For a summary of model runs, see Table 1. An illustration of Method A is shown in Fig. 2.

2.4 Method B

In an operational setting, Method A only changes the forcing at the sea surface for a period of time in the past. Consequently, some changes to the modelled storm surge may be lost, especially when analysis wind fields are used far from landfall. To remedy this, we would like to be able to use the information available in past analysis wind fields to modify future wind fields. For this method, we take the most recent analysis wind field available (in a real-time setting) and use the differences between this dataset and its corresponding parametric wind field to proportionally change future wind fields.

We begin by generating a parametric wind field at the same point in time (t) as the present analysis wind field. For every point in the wind field, the following innovations are calculated:

$$\begin{aligned} e_v (x,y,t)&= a_v (x,y,t)- s_v (x,y,t), \end{aligned}$$
(7)
$$\begin{aligned} e_u (x,y,t)&= a_u (x,y,t) - s_u (x,y,t), \end{aligned}$$
(8)

where \(a_u(x,y,t)\) and \(a_v(x,y,t)\) are the u (eastwards) and v (northwards) components of the analysis wind field at time t and location (x, y), and \(s_u(x,y,t)\) and \(s_v(x,y,t)\) are the u and v components of the parametric wind field at time t and location (x, y). These (x, y) coordinates are relative to the centre of the storm, i.e. their origin is at the storm centre, so (x, y) refers to the position that is x units to the east and y units to the north of the storm centre. This means that the innovations also move with the storm. The future wind field at time \(t+n\) is then generated using:

$$\begin{aligned} m_v(x,y,t+n)&= s_v (x,y,t+n) + \frac{e_v(x,y,t)}{s_v(x,y,t)} s_v(x,y,t+n) \end{aligned}$$
(9)
$$\begin{aligned} m_u(x,y,t+n)&= s_u(x,y,t+n) + \frac{e_u(x,y,t)}{s_u(x,y,t)} s_u(x,y,t+n), \end{aligned}$$
(10)

where \(m_v(x,y,t+n)\) and \(m_u(x,y,t+n)\) are the resulting wind components used at time \(t+n\). After time t, the wind fields are modified up until landfall. See Fig. 2 for an illustration. We generate an entirely new modified parametric wind field at time t + n by calculating \(m_v(x,y,t+n)\) and \(m_u(x,y,t+n)\) at every point in the wind dataset.

This method assumes that the innovations \(e_v(x,y,t)\) and \(e_u(x,y,t)\) as a proportion of the parametric wind components do not change up until landfall. Although this underlying assumption is rather simple, we hope that by using it we will be able to make large-scale spatial corrections to the parametric wind fields.
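Eqs. (7)–(10) can be sketched for a single wind component as shown below. The function names are hypothetical. The text does not say how points with a near-zero parametric component are handled, so the guard used here (leaving the future parametric value unchanged) is purely an assumption.

```python
def modify_component(analysis_t, parametric_t, parametric_future, eps=1e-6):
    """Apply Eqs. (7)-(10) to one wind component at one storm-relative point."""
    if abs(parametric_t) < eps:
        # Assumption: the proportional innovation is undefined when the
        # parametric component is ~0, so keep the unmodified parametric value.
        return parametric_future
    innovation = analysis_t - parametric_t  # Eq. (7) or (8)
    # Eq. (9) or (10): apply the same proportional change to the future field.
    return parametric_future + (innovation / parametric_t) * parametric_future


def modify_field(analysis_t, parametric_t, parametric_future):
    """Apply modify_component at every point of a 2D storm-relative field."""
    return [
        [modify_component(a, s, s_next) for a, s, s_next in zip(row_a, row_s, row_n)]
        for row_a, row_s, row_n in zip(analysis_t, parametric_t, parametric_future)
    ]


# Hypothetical values (m/s): the analysis is 20% stronger than the parametric
# field at time t, so the future parametric value is also increased by 20%.
m_future = modify_component(analysis_t=12.0, parametric_t=10.0, parametric_future=15.0)
print(m_future)
```

Written this way, Eqs. (9) and (10) are equivalent to multiplying the future parametric component by the ratio of the analysis to the parametric component at time t, which is why the method is sensitive wherever the parametric component is close to zero.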

Fig. 2 Illustrations of Methods A (left) and B (right) for 12 h before landfall (see A12 and B12 in Table 1). Time T is equivalent to the present in a real-time forecasting setting

Fig. 3 IBTrACS best track estimates of location and intensity category at times around landfall for the three case studies in this paper: Ike, Gustav and Sandy. ET denotes where the storm was no longer tropical. Grey areas show land, and white areas show ocean

The specific model runs performed for each of the three storms are shown in Table 1. We also perform a control run using just parametric wind fields for comparison purposes.

Table 2 Mean absolute errors for wind u-component fields (parametric and those generated by Method B)
Table 3 Mean absolute errors for wind v-component fields (parametric and those generated by Method B)

To test the simple assumption used for Method B, we can use analysis wind fields to calculate approximate error fields for parametric and modified wind fields. We calculate these error fields at 0, 6 and 12 h before landfall. The mean absolute errors (MAE) of these error fields are shown in Tables 2 and 3. Tables 2 and 3 show MAE values over the wind u- and v-component fields.
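The MAE calculation can be sketched as follows, treating the analysis field as the reference. The function name and all wind values are hypothetical and are not taken from Tables 2 and 3.

```python
def mean_absolute_error(field, reference):
    """Mean of |field - reference| over all points of two equally shaped 2D fields."""
    diffs = [
        abs(f - r)
        for row_f, row_r in zip(field, reference)
        for f, r in zip(row_f, row_r)
    ]
    return sum(diffs) / len(diffs)


# Hypothetical u-component fields (m/s) on a 2 x 2 storm-relative grid.
analysis = [[12.0, 15.0], [9.0, 6.0]]
parametric = [[10.0, 12.0], [8.0, 6.0]]
modified = [[11.5, 14.0], [9.0, 6.5]]

mae_parametric = mean_absolute_error(parametric, analysis)
mae_modified = mean_absolute_error(modified, analysis)
print(mae_parametric, mae_modified)
```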

Generally, the parametric wind fields are the worst performing. The best performing B method wind fields are those that are generated using innovations from the most recent analysis wind fields. This suggests that our assumption that the proportional innovations (e.g. \(\frac{e_v}{s_v}\) from Eqs. 9 and 10) are constant over time has some validity in the short term (6–12 h). The longer the modifications are applied for, the larger the MAEs tend to become, although they are still smaller than the parametric MAEs in many cases.

Table 4 Average absolute difference from observations for each model run during each individual storm event and all storm events combined

These results suggest that, even if the proportional innovations are not constant over time, they change slowly enough such that our underlying assumption for Method B can be used for some period of time into the future.

3 Results

3.1 Statistics

The following definition of surge is used:

$$\begin{aligned} R(t) = O(t) - P(t), \end{aligned}$$
(11)

where R(t) is the non-tidal residual at time t, O(t) is the observed or modelled water level at time t, and P(t) is the predicted water level (due to tides) at time t. For comparison purposes, we take the maximum surge heights to be the maximum values of R(t) during an entire storm event. Throughout this work, we refer to this as MSH. We use this statistic because of its importance to forecasting. For validation, we use tide gauge data (from the NOAA database).
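A short sketch of the MSH calculation follows. It takes the residual as water level minus predicted tide, the usual convention for a non-tidal residual, so that a positive surge gives a positive MSH. The function names and the time series values are hypothetical.

```python
def non_tidal_residual(water_level, predicted_tide):
    """Residual at each time step: observed or modelled level minus predicted tide."""
    return [w - p for w, p in zip(water_level, predicted_tide)]


def max_surge_height(water_level, predicted_tide):
    """MSH: the largest non-tidal residual over the whole storm event."""
    return max(non_tidal_residual(water_level, predicted_tide))


# Hypothetical water levels and tidal predictions (m) at four time steps.
water_level = [0.5, 1.2, 2.9, 2.1]
predicted_tide = [0.4, 0.6, 0.7, 0.5]

msh = max_surge_height(water_level, predicted_tide)
print(round(msh, 2))
```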

Table 5 Standard deviation of absolute difference from observations for each model run during each individual storm event and all storm events combined

For each storm event, six tide gauge sites have been chosen for model validation (e.g. see Fig. 4). The observations from these sources can be used for investigating any improvement in forecast accuracy. The tide gauges are chosen based on data availability and also to give a good spatial idea of the surge heights in the basin.

3.2 Hurricane Ike (2008)

Ike made landfall as a category 2 hurricane at Galveston, Texas, on 13 September 2008. See Fig. 3 for the track and intensity categories of Ike over the simulation period. The storm caused widespread damage far along the Louisiana and Mississippi coastlines. The total cost of the hurricane is estimated to be $29.5 billion. A total of 195 people are estimated to have died, with 112 of these being in the USA (Berg 2014).

Figure 4 shows the MSH values for the control, A6, A12, B6, B12 and B18 model runs over the domain used for Hurricane Ike. A18 is visually similar to the control run, so it is not shown in this figure. The tide gauges used for this storm event are Freeport (FP), Galveston Pleasure Pier (GP), Eagle Point (EP), Sabine Pass North (SP), Port Arthur (PA) and Calcasieu Pass (CP). Their approximate locations are shown in Fig. 4. For convenience, we will refer to these tide gauges by their abbreviations. Figure 5 shows modelled minus observed MSH values at each tide gauge. We use these two figures for the discussion below. The A24 model was found to be almost identical to the control run, so its results are not discussed here.

From Fig. 4, it can be seen that the spatial structure of the surge is similar for all model runs, but there are some important differences. All modified models see a spreading of the surge westwards along the section of coastline between GP and FP. This is most significant for the B method model runs and A6 and can be seen at the tide gauges as an increase in MSH at FP for these model runs. This increase also means that the B model runs and A6 are around 40 cm closer to the observations at FP than the control, suggesting that this westward increase in the surge might be more representative of the true sea surface state.

All B methods and A6 also see a significant increase in the sea level around EP. This is largest for the B methods, where there is a large improvement in accuracy of the model output (B6 improves over the control by 0.88 m, B12 by 1.06 m and B18 by 0.95 m). All model runs also see small improvement (or no change) at GP and SP.

Comparisons at the PA tide gauge suggest that A6 and the B method model runs perform poorly. Here, the B methods and A6 all increase the MSH in the Sabine Lake area. This increase is detrimental to the accuracy of models at this tide gauge, dragging the modelled MSH up to 0.51 m (for B18) further from the observations than the control. This could be due to the complex coastal geometry in this area and the fact that the tide gauge is situated on an inland body of water. The water levels at this location are somewhat dependent on the flow through the channel of water on which SP sits meaning that any errors in modelling this flow might affect the modelled MSH at PA. It is also worth noting that the control run performed very well at PA, meaning that any changes in MSH would probably result in a worse model output.

In general, the effect of using the analysis wind fields (both Method A and B) is to increase the MSH at all tide gauge locations except at CP. A decrease here makes the accuracy of the model output worse, especially for A6 and B18 where the modelled MSH at CP is 0.21 and 0.31 m (respectively) further from the observed MSH.

Of the A methods, A6 has the most effect on the model output, whereas A18 has very little effect. Table 4 shows the average difference from observations for each model run. On average, A6 improves the model forecast by 0.08 m, but A12 and A18 have much less of an effect. The B methods perform the best, especially B12, which improves the model output by 0.25 m. Finally, Table 5 shows that the standard deviation of these differences is generally similar to the control or much lower.
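The statistics reported in Tables 4 and 5 can be sketched as below. The MSH values are hypothetical and are not taken from the tables. The text does not state whether a sample or population standard deviation is used, so the sample form from Python's `statistics` module is an assumption.

```python
from statistics import mean, stdev


def gauge_error_summary(modelled_msh, observed_msh):
    """Return (mean, sample standard deviation) of |modelled - observed| over gauges."""
    abs_diffs = [abs(m - o) for m, o in zip(modelled_msh, observed_msh)]
    return mean(abs_diffs), stdev(abs_diffs)


# Hypothetical MSH values (m) at three tide gauges.
observed = [2.0, 3.0, 4.0]
control = [1.4, 2.7, 3.1]
modified = [1.8, 2.9, 3.6]

control_mean, control_sd = gauge_error_summary(control, observed)
modified_mean, modified_sd = gauge_error_summary(modified, observed)

# A positive improvement means the modified run is closer to the observations.
improvement = control_mean - modified_mean
print(round(control_mean, 2), round(modified_mean, 2), round(improvement, 2))
```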

Fig. 4 MSH for Hurricane Ike hindcasts. Black stars show the locations of the tide gauges used

Fig. 5 Modelled–observed MSH at each tide gauge for Hurricane Ike. Negative values indicate where the model underestimates the observed MSH and positive values indicate overestimation. Values in grey at the top of each tide gauge show the observed MSH at that location

3.3 Hurricane Sandy (2012)

Sandy made US landfall near Brigantine, New Jersey, on 29 October 2012. Although not technically a hurricane at the point of landfall, it is the second costliest storm in US history, behind Katrina. See Fig. 3 for the track and intensity categories of Sandy over the simulation period. At its peak size, it was the largest Atlantic hurricane on record. The storm is estimated to have caused $75 billion worth of damage and to have killed 233 people across eight countries (Neria and Shultz 2012).

Figure 6 shows the MSH for the control, A6, A12, B6, B12 and B18 model runs of Hurricane Sandy along with the locations of the six chosen tide gauges: Ocean City Inlet (OC), Atlantic City (AC), The Battery (TB), Montauk (MT), Kings Point (KP) and Bridgeport (BP). Figure 7 shows modelled minus observed MSH values at each tide gauge. Once again, the A24 model was found to be almost identical to the control run, so its results are not discussed here. Note that results for AC and OC were taken from a different model domain than shown in Fig. 6 due to better representation at these locations. B6 is visually similar to B12, so only B12 is shown.

A6, A18 and the B model runs all result in various levels of increase in the estuary areas around KP, BP and TB as well as a spreading of the surge southward along the coastline between AC and OC. The increase around KP, BP and TB is most extreme for B18, with as much as 1.16 m being added to the MSH value for BP (causing it to be further from the observations).

The southward spreading towards AC and OC leads to improvements for the B model runs and A6, especially at AC. Figure 1 shows a large area in the upper left quadrant of the storm at 12 h before landfall where the analysis field is more than 15 m s\(^{-1}\) stronger than the parametric field. These stronger winds will transfer more momentum to the sea surface towards the coastline between AC and OC, giving the increase in MSH that we see at these locations. This feature of the storm looks to be fairly persistent as it is still present 6 h later.

Interestingly, A12 differs from all the other runs as it reduces the MSH around KP, BP and TB when compared to the control run. This could be explained by a band of winds over the area that are around 5 m s\(^{-1}\) weaker in the analysis field than in the parametric field at 12 h before landfall. In the control model run, the stronger parametric winds in this area start an earlier build up of surge at 12 h before landfall. In the A6 model run, this does not happen, but the stronger central winds around the storm centre could compensate for this as the storm approaches landfall.

Similarly to Ike, the general trend is for all of the methods to induce an increase or little difference in MSH at each tide gauge. The main exception to this rule is A12 at KP and BP.

Table 4 shows the average difference from observations for each model run for Hurricane Sandy. On average, A6 improves the model forecast by 0.12 m when compared to the control. A18 again has a smaller effect on the output; however, it is more significant than for Ike. The B methods once again perform the best, with all three improving the model output by 0.21–0.24 m, which is a good result. Table 5 shows that, generally speaking, the standard deviation of these differences is lower than for the control except for A12, where it is significantly higher. A12 also performs poorly in an average sense, leading to the model output being 0.06 m further from the observations than the control on average.

Fig. 6 MSH for Hurricane Sandy hindcasts. Black stars show the locations of the tide gauges used

Fig. 7 Modelled–observed MSH at each tide gauge for Hurricane Sandy. Negative values indicate where the model underestimates the observed MSH and positive values indicate overestimation. Values in grey at the top of each tide gauge show the observed MSH at that location

3.4 Hurricane Gustav (2008)

Gustav was the second most destructive storm of the 2008 season, behind Ike. It made US landfall as a category 2 hurricane at Cocodrie, Louisiana on September 1, 2008. See Fig. 3 for the track and intensity categories of Gustav over the simulation period. The storm killed an estimated 112 people in total and caused $4.3 billion of damages in the USA (Beven and Kimberlain 2014).

Figure 8 shows the MSH for the control, A6, A12, B6, B12 and B18 model runs of Hurricane Gustav along with the locations of the six chosen tide gauges: LAWMA Amerada Pass (AP), Grand Isle (GI), Pilots Station (PS), Shell Beach (SB), Bay Waveland Yacht Club (BW) and Dauphin Island (DI). Figure 9 shows modelled minus observed MSH values at each tide gauge. Again, the A24 model was found to be almost identical to the control run, so its results are not discussed here. A18 is also visually very similar to the control run, so it is not shown in Fig. 8.

A12 looks similar to the control run, and follows the control closely at the tide gauges, except for SB where there is a small increase in MSH (and consequently improvement). A6 and the B methods give an increase in MSH between SB and BW as well as an eastwards spreading along the coastline between BW and DI. Similarly to Hurricane Sandy, this increase is very noticeable for B18. The tide gauges show that the control run significantly underestimates MSH at BW, SB and DI, so this increase improves the model output when compared to the control for A6 and the B model runs. Improvement at BW is an important result as the control run forecast a very small surge for the area.

B6 and B12 both reduce the MSH in the area immediately around the GI tide gauge (although B12 causes an increase further inland). The GI tide gauge shows that the control overestimated the MSH at this location, so this reduction means that B6 and B12 are both closer to the observations. On the other hand, B18 increases the MSH at this location by over 0.6 m. Figure 1 shows an area surrounding the storm centre where the analysis winds are slightly weaker than the parametric winds (both for 6 and 12 h before landfall). The storm passes just south of GI before landfall, and so it will be this weaker area of the modified wind field that passes over the tide gauge. These weaker winds could be leading to the reduction (and improvement) in the modelled MSH at GI for B6 and B12.

Around AP, the B methods cause a modest increase in MSH of 0.27–0.51 m, which leads to worse results at this location.

Similar to Hurricane Ike, of all the A methods A6 has the largest effect on MSH, whereas A12 and A18 have a much smaller effect (with the effect of A18 being almost negligible at most tide gauges). B18 once again causes drastic increases in MSH over a large area.

Table 4 shows the average difference from observations for each model run for Hurricane Gustav. Almost all model runs see an average improvement when compared to the control (although this improvement is small for A12, A18 and B18). B6 and B12 perform very well, improving the model output at the tide gauge locations by 0.36 and 0.38 m, respectively, which is a good result. B6 and B12 also reduce the standard deviation of these differences significantly (Table 5).

Fig. 8 MSH for Hurricane Gustav hindcasts. Black stars show the locations of the tide gauges used

Fig. 9. Modelled–observed MSH at each tide gauge for Hurricane Gustav. Negative values indicate where the model underestimates the observed MSH and positive values indicate overestimation. Values in grey at the top of each tide gauge show the observed MSH at that location

4 Conclusions

In this paper, we attempted to answer the question: To what extent and when is using analysis wind fields in an operational storm surge model useful?

We proposed two methods for using analysis wind fields. Our first method, Method A, simply replaces the parametric wind forcing with analysis wind data from 24 h before landfall to a set point in time. Essentially, this method can be thought of as changing the sea surface state at a point in time by changing the wind forcing up until that point. Method B does the same but then attempts to extrapolate this wind forcing into the future by using the most recently available analysis data. To test these methods, we ran hindcasts for three storms: Hurricane Ike, Hurricane Sandy and Hurricane Gustav. We tested the methods over different time periods to see exactly when they might be useful.
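The two methods can be sketched in a few lines of Python. This is a minimal illustration rather than the model code used in the study. It assumes wind speeds are stored as arrays with time as the first axis (index 0 at 24 h before landfall), it takes the extrapolation in Method B to be a ratio of analysis to parametric wind at the most recent analysis time, and it ignores the movement of the storm between time steps. The function names, the `cutoff` index and the `eps` guard are hypothetical.

```python
import numpy as np


def method_a(parametric, analysis, cutoff):
    """Use analysis winds up to and including time index `cutoff`,
    and the unmodified parametric winds afterwards."""
    forcing = np.array(parametric, dtype=float)  # copy; input is left untouched
    forcing[: cutoff + 1] = np.asarray(analysis, dtype=float)[: cutoff + 1]
    return forcing


def method_b(parametric, analysis, cutoff, eps=1e-6):
    """As Method A, but also scale the later parametric winds by the ratio
    of analysis to parametric wind at the most recent analysis time.

    The ratio form of the innovation is an assumption made for this sketch.
    """
    parametric = np.asarray(parametric, dtype=float)
    analysis = np.asarray(analysis, dtype=float)
    forcing = method_a(parametric, analysis, cutoff)
    # eps avoids division by zero where the parametric wind vanishes
    ratio = analysis[cutoff] / np.maximum(parametric[cutoff], eps)
    forcing[cutoff + 1:] = parametric[cutoff + 1:] * ratio
    return forcing
```

With hourly time steps, a run such as A6 or B6 would correspond to `cutoff=18` (6 h before landfall), and a later cutoff leaves fewer time steps to which the extrapolation is applied.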

We were able to test the wind fields modified by Method B through comparison to analysis wind fields. We found that there was a general improvement to the average errors of the wind fields, especially in short time periods. However, the longer the extrapolation was applied for, the larger the errors became.

Of the A methods, A6 performed the best, improving upon the control by 0.15 m on average. A12 performed generally quite well too but was let down by particularly bad performance at the Kings Point and Bridgeport tide gauges for Hurricane Sandy. A18 generally followed the control quite closely, and A24 was almost identical to the control. In part, this is unsurprising as you would expect to see larger responses in the model sea surface to changes in the wind forcing in shallower water. As a storm approaches landfall, the ocean over which it travels becomes shallower, meaning that the surface wind stress terms in Eqs. (1)–(3) become larger and more significant. Additionally, the model sea surface has a ‘memory’, i.e. a time period over which changes to the sea surface height will diminish. This study suggests that this time period is somewhere around 12 h, with changes before this point becoming small or negligible over time.
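The depth dependence can be made explicit with a short worked form. Assuming Eqs. (1)–(3) take the usual depth-averaged shallow water form (the symbols below are illustrative and may differ from those used earlier in the paper), the contribution of the surface wind stress to the momentum balance is

```latex
\left.\frac{\partial \mathbf{u}}{\partial t}\right|_{\mathrm{wind}}
  = \frac{\boldsymbol{\tau}_s}{\rho H},
\qquad H = h + \eta ,
```

where \boldsymbol{\tau}_s is the surface wind stress, \rho is the water density, h is the undisturbed water depth and \eta is the sea surface elevation. For a fixed wind stress, halving the total depth H doubles this term, which is consistent with changes to the wind forcing having a larger effect as the storm moves into shallower water near landfall.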

B6 and B12 performed the best of all model runs, improving output on average by 0.25 and 0.29 m, respectively, and also more than halving the standard deviation of the errors. B18 also improved the model output on average, but behaved quite erratically, often significantly overestimating the storm surge. This broadly lines up with our analysis of the quality of the modified wind fields: since B18 implemented the longest extrapolation of innovations, it would also have introduced the largest errors to the model (of the B methods). For B6 and B12, modifications made to wind fields around landfall (the most influential for surge generation) are based on innovations from more recent analysis wind fields.

Importantly, our methods often increased the modelled storm surge at locations where the control only gave a small storm surge. These changes to the spatial structure of the storm surge were successfully verified by tide gauge observations. For example, the Bay Waveland Yacht Club tide gauge (Hurricane Gustav) saw increases of over 1 m in its modelled surge height, where previously the modelled storm surge was relatively small (1.4 m). Similarly, the control model run gave a surge height of 0.94 m at Freeport (Hurricane Ike). This was increased by nearly 0.5 m by the B6 and B12 model runs, giving a smaller error. This is an important result because an incorrectly small storm surge prediction at a location might mean that necessary precautions are not taken to protect people and property.

Our results suggest that the use of either method in a real-time setting could give forecasters useful information regarding the spatial structure of the storm surge, although perhaps only 12 h before landfall and later. Investigating exactly how the proportional innovations used for Method B propagate through time would be useful for improving upon this method, especially for forecasts performed earlier than 12 h before landfall.

There are some limitations to note. Our results were based on sets of six tide gauges. This is not a large dataset and evaluation of these methods over more tide gauges and model domains would be useful. The model output is also constrained by the quality of the best track data used and the analysis wind fields. It is important to note that in a real-time operational setting best track data would not be available. There are large errors associated with the forecasting of hurricane track and intensity. For example, during the 2000–2008 period, the track forecast error for the Atlantic basin (difference between forecast position and best track position) was around 60 nmi for a 24-h forecast and 30 nmi for a 12-h forecast (Rappaport et al. 2009). Additionally, the error in the 24-h intensity forecast over the same period was around 8–11 knots. In this study, we wanted to investigate the effect of modifying only the wind fields in the model. However, our methods might yield better results when future hurricane properties are more uncertain. There is scope to investigate how these methods perform for forecasted storm parameters rather than best track parameters in future work.

Finally, although adjustments were made to the wind forcing, no adjustment was made to the underlying pressure field, which also has a small effect on the storm surge. In coastal areas, the wind forcing is more significant; however, it might be useful to investigate inversion methods to generate pressure fields based on the analysis wind fields. For example, Brown and Levy (1986) developed a method for estimating atmospheric pressure fields from satellite-derived winds.