1 Introduction

Coastal flooding can damage the lower portions of buildings, their contents, and other infrastructure. An accurate estimate of future water levels is needed to facilitate flood-resistant design and performance evaluation (e.g., define the design flood level and estimate the performance of structures under coastal flooding). Traditional methods to analyze extreme coastal water levels are often based on the assumption that coastal water level time series are stationary (ST) (e.g., Coles 2001). However, rising sea levels contribute to coastal flooding, thereby invalidating this assumption (Tebaldi et al. 2012; Zervas 2013; Kopp et al. 2014; Vitousek et al. 2016; Ghanbari et al. 2019, 2021; Taherkhani et al. 2020; Baldan et al. 2022). The global sea level has been increasing over the past century (Church and White 2006; Cazenave and Llovel 2010; IPCC 2014; Hay et al. 2015). The rate of sea level rise (SLR) is anticipated to increase globally and regionally, particularly under greenhouse gas emission scenarios (Nicholls and Cazenave 2010; Parris et al. 2012; IPCC 2014; Kopp et al. 2014; Hall et al. 2016; Sweet et al. 2017; Taherkhani et al. 2020).

Several studies over the course of the past few decades have accounted for the impact of SLR in nonstationary (NS) coastal flood frequency analysis. Their methodologies can be classified into two types. One involves fitting NS probability distributions to water level data, where the parameters of the NS probability distributions are modeled as functions of time (e.g., Katz et al. 2002; Mudersbach and Jensen 2010; Obeysekera and Park 2013; Obeysekera and Salas 2014; Salas and Obeysekera 2014; Luke et al. 2017; Razmi et al. 2017; Ghanbari et al. 2019, 2021; Baldan et al. 2022). The impact of SLR is implicitly and statistically incorporated in the parameters of the NS probability distribution. Flood hazard curves (water levels vs. return periods) are developed by extrapolating the parameter functions of the NS probability distribution into the future. Note that this extrapolation is valid only if water level trends remain the same. However, sea level trend projections are likely to show an acceleration (Nicholls and Cazenave 2010; Parris et al. 2012; IPCC 2014; Kopp et al. 2014; Hall et al. 2016; Sweet et al. 2017), which can affect the probabilistic behavior of water levels in the future. As such, the estimated functions for the NS probability distribution parameters are anticipated to change, rendering flood hazard curves developed by extrapolation idealized and of limited practical value. The other methodology involves detrending water levels (removing the SLR trend) and then fitting ST probability distributions to the detrended water level data (e.g., Kirshen et al. 2008; Menéndez and Woodworth 2010; Tebaldi et al. 2012, 2021; Zervas 2013; Sweet et al. 2014, 2022; Taherkhani et al. 2020). An ST probability distribution is defined by parameters that are fixed (time independent). In this methodology, the nonstationarity is accounted for by the SLR trend removed during detrending. The limitation of this methodology is that the concept of return period (RP) in an NS context is not considered. The developed flood hazard curves only report an RP for a given water level at a specific time, obtained from the fitted ST probability distribution and an estimated SLR.

The water level data for coastal flood frequency analysis are usually collected by one of two methods: annual maximum series and peak-over-threshold (POT). An annual maximum series is constructed using the highest recorded water level value for each year, which means that the sample size is equal to the number of years of data. The primary advantages of the annual maximum series method are its simplicity and the independence of the extracted extremes (Tabari 2021). However, using only one data point per year excludes potentially useful information, as there might be more than one large recorded water level in a year relative to other years’ maxima (Lang et al. 1999; Bezak et al. 2014; Tabari 2021). An alternative to the annual maximum approach, the POT method, has been widely used in recent coastal flood frequency analyses (e.g., Tebaldi et al. 2012, 2021; Zervas 2013; Bezak et al. 2014; Razmi et al. 2017; Talke et al. 2018; Ghanbari et al. 2019, 2021; Taherkhani et al. 2020). The POT method consists of retaining all peak water level values above a certain truncation level, usually referred to as the threshold (Lang et al. 1999). Thus, the POT method is not limited to one data point per year and allows for a more rational selection of extreme events. The main challenges of the POT method are choosing an appropriate threshold value and ensuring the independence of threshold-exceeding data (Bezak et al. 2014). The appropriate threshold can be selected via multiple approaches, including the mean excess plot (e.g., Davison and Smith 1990; Lang et al. 1999; Coles 2001; Bommier 2014; Razmi et al. 2017) and the average number of over-threshold events (e.g., Lang et al. 1999; Nadal-Caraballo et al. 2016). The independence of threshold-exceeding data can be satisfied by using physical bases to set the time interval between peaks (e.g., Lang et al. 1999; Méndez et al. 2006; Nadal-Caraballo et al. 2016; Ghanbari et al. 2019, 2021). For threshold-exceeding data, a generalized Pareto distribution (GPD) is commonly used to model its probabilistic behavior (e.g., Davison and Smith 1990; Stedinger et al. 1993; Lang et al. 1999; Coles 2001; Méndez et al. 2006; Eastoe and Tawn 2010; Katz 2013; Silva et al. 2014; Nadal-Caraballo et al. 2016; Razmi et al. 2017; Salas et al. 2018; Baldan et al. 2022; Pan et al. 2022).

The primary goal of this paper is to develop a methodology for NS coastal flood frequency analysis that produces flood hazard curves that can provide a complete probabilistic account for future coastal flood hazards. This involves (1) estimating SLR to detrend water level data, (2) using the POT method to estimate the probabilistic behavior of detrended water levels, (3) incorporating the existing sea level trend projections in RP calculation, and (4) describing the NS behavior of RPs. Critical features of this paper that distinguish it from previous studies are (1) incorporation of sea level trend projections, which arise from the consequences of climate change, in frequency analysis, and (2) consideration of the impact of NS behavior of RPs in flood hazard curve development. This methodology is applied to two case studies using water level data recorded by National Oceanic and Atmospheric Administration (NOAA) stations (NOAA 2023) near Boston, Massachusetts, and New York City, New York.

2 Methodology

In this section, the methodology for NS coastal flood frequency analysis that produces flood hazard curves is presented. This methodology includes SLR estimation, POT analysis for detrended water levels, annual exceedance probability estimation, and return period calculation incorporated with sea level trend projection.

2.1 Sea level rise estimation

Analogous to NOAA (2023), the existing SLR trend is estimated using monthly mean sea levels recorded by NOAA stations. It is worth noting that regular seasonal fluctuations from coastal ocean temperatures, salinity, wind, atmospheric pressure, and ocean currents are removed from the monthly mean sea levels (NOAA 2023). The estimated SLR is used to detrend recorded water levels during coastal floods, as described in the next section. Analogous to Obeysekera and Park (2013), Goddard et al. (2015), Nadal-Caraballo et al. (2016), Razmi et al. (2017), Baldan et al. (2022), Sweet et al. (2022), and NOAA (2023), the SLR trend is estimated by linear regression as

$$y={\beta }_{0}+{\beta }_{1}\bullet t+\varepsilon$$
(1)

where \(y\) is the sea level; \(t\) represents the recorded time; \({\beta }_{0}\) and \({\beta }_{1}\) are the regression intercept and slope; and \(\varepsilon\) represents the regression residuals, which are assumed to be normal, homoscedastic (equality of variances), and independent (no serial correlation). However, based on the case studies shown in Sect. 3 (Results and Discussions), neither parametric (e.g., ordinary and weighted least squares) nor nonparametric (e.g., Theil-Sen (Theil 1950)) estimators yield residuals that are homoscedastic and independent according to the Breusch-Pagan test (Breusch and Pagan 1979) and the correlogram (Serago and Vogel 2018). One way to mitigate the heteroscedasticity of regression residuals is through data transformation (Helsel et al. 2020). Since the SLR is estimated linearly, the data transformation is accomplished by dividing each side of Eq. (1) by \(t\):

$$\frac{y}{t}={\beta }_{0}\bullet \frac{1}{t}+{\beta }_{1}+\frac{\varepsilon }{t}$$
(2)

Denoting \(\frac{1}{t}\) as \(t{\prime}\), and \(\frac{y}{t}\) as \(y{\prime}\), Eq. (2) can be rewritten as

$$y^{{\prime }} = \beta _{0} \bullet t^{{\prime }} + \beta _{1} + \varepsilon ^{{\prime }}$$
(3)

where \(\varepsilon ^{{\prime}}\) (equal to \(\frac{\varepsilon }{t}\)) is homoscedastic. To address the presence of serial correlation in \(\varepsilon ^{{\prime}}\), Helsel et al. (2020) recommend grouping the data into time intervals, computing a summary statistic (such as the mean or median) for each interval, and then using these summary statistics in the regression. This is suggested because the dependence in \(\varepsilon ^{{\prime}}\) indicates considerable redundancy in the data (Helsel et al. 2020). In this paper, the mean values of the sea levels over six-month intervals are used in the regression.
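For concreteness, the following minimal sketch illustrates this fitting procedure, assuming the deseasonalized monthly mean sea levels are available as plain numeric arrays; the function name, the decimal-year time scale, and the six-month binning rule are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def estimate_slr_trend(t_years, msl_m):
    """Hedged sketch of Eqs. (1)-(3): t_years are decimal years and msl_m are
    monthly mean sea levels (m) with seasonal signals already removed."""
    t_years = np.asarray(t_years, dtype=float)
    msl_m = np.asarray(msl_m, dtype=float)

    # Average the monthly values over six-month intervals, as suggested by
    # Helsel et al. (2020), to reduce serial correlation in the residuals.
    bins = np.floor((t_years - t_years.min()) / 0.5).astype(int)
    t_mid = np.array([t_years[bins == b].mean() for b in np.unique(bins)])
    y_mean = np.array([msl_m[bins == b].mean() for b in np.unique(bins)])

    # Transformed variables of Eq. (3): y' = y/t and t' = 1/t, so that the
    # residuals are closer to homoscedastic (t must be strictly positive).
    y_prime = y_mean / t_mid
    t_prime = 1.0 / t_mid

    # Ordinary least squares on Eq. (3): the coefficient on t' is beta_0 and
    # the intercept is beta_1 (the SLR slope of Eq. (1)).
    A = np.column_stack([t_prime, np.ones_like(t_prime)])
    (beta0, beta1), *_ = np.linalg.lstsq(A, y_prime, rcond=None)
    return beta0, beta1
```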

2.2 Peak-over-threshold analysis for detrended water levels

A common practice when using the POT method is to remove the trend in the extreme dataset and then fit a probability distribution to the detrended extremes (e.g., Kirshen et al. 2008; Zervas 2013; Sweet et al. 2014, 2022; Nadal-Caraballo et al. 2016). The estimated SLR trends are used to detrend the hourly water levels recorded at the NOAA stations, which are obtained from tide gauges that measure the water level with respect to a local, fixed reference point on land. As such, these hourly water levels include (in addition to the SLR and local vertical land motion) astronomical tide height, storm surge height, and limited wave setup (FEMA 2007). The detrended hourly water levels are calculated as

$$z={z}_{t}-({\beta }_{0}+{\beta }_{1}\bullet t)$$
(4)

where \({z}_{t}\) is the hourly water level recorded at time \(t\); \({\beta }_{0}\) and \({\beta }_{1}\) are obtained by Eq. (3). These detrended hourly water levels are used in frequency analysis.

The peaks of the detrended hourly water levels (extreme values) that are above an appropriate threshold need to meet the independence condition as a prerequisite for frequency analysis and the Poisson process assumption (e.g., Davison and Smith 1990; Lang et al. 1999; Nadal-Caraballo et al. 2016; Razmi et al. 2017). In this paper, the independence condition is defined from both physical and statistical perspectives. From a physical standpoint, Nadal-Caraballo et al. (2016) proposed meeting the independence condition for historical extreme water levels in the POT method by imposing a two-day time interval from the end of one storm to the start of the next. Note that this imposed time interval does not define the time between peaks, which would be larger than two days. In this paper, a three-day time interval between peaks is selected to meet the independence condition, as also recommended by Méndez et al. (2006), Sweet et al. (2014), Ghanbari et al. (2019), and Ghanbari et al. (2021). From a statistical standpoint, a correlogram (Serago and Vogel 2018) is used to evaluate the independence of the peaks of detrended hourly water levels above the selected threshold (hereinafter referred to as POT water levels).
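A minimal declustering sketch consistent with this description is given below; the paper specifies only the three-day separation between peaks, so the largest-peak-first selection rule and the helper name are assumptions made for illustration.

```python
import numpy as np

def extract_pot_peaks(times_h, levels_m, threshold_m, min_sep_days=3.0):
    """Hedged sketch: keep peaks of the detrended hourly water levels (Eq. (4))
    that exceed the threshold, enforcing a minimum separation between peaks."""
    times_h = np.asarray(times_h, dtype=float)    # hours since start of record
    levels_m = np.asarray(levels_m, dtype=float)  # detrended water levels (m)

    # Process candidate exceedances from the largest downwards so that smaller
    # exceedances within three days of a larger one are discarded.
    exceed = np.flatnonzero(levels_m > threshold_m)
    order = exceed[np.argsort(levels_m[exceed])[::-1]]

    kept = []
    min_sep_h = min_sep_days * 24.0
    for i in order:
        if all(abs(times_h[i] - times_h[j]) >= min_sep_h for j in kept):
            kept.append(i)
    kept = sorted(kept)
    return times_h[kept], levels_m[kept]
```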

2.2.1 Threshold selection

In the POT method, it is important to select an appropriate threshold – one that is high enough to meet the model assumptions that the POT water levels are independent and that the occurrence process can be described as a Poisson process. One commonly used method to select a threshold is interpreting the mean excess plot, also known as the mean residual life plot (e.g., Davison and Smith 1990; Lang et al. 1999; Coles 2001; Bommier 2014; Razmi et al. 2017; Pan et al. 2022). A mean excess plot shows the threshold versus the mean excess of the peaks above that threshold, which is

$$\frac{1}{{n}_{u}}\sum_{i=1}^{{n}_{u}}({x}_{i}-u)$$
(5)

where \(u\) is the threshold, \({x}_{i}\) is the \(i\)th POT water level larger than \(u\), and \({n}_{u}\) is the number of POT water levels larger than \(u\). If the GPD provides a valid approximation of the POT water levels larger than \(u\), the mean excess plot should be approximately linear (Davison and Smith 1990; Lang et al. 1999; Coles 2001; Bommier 2014; Razmi et al. 2017). In addition to the mean excess plot, the average number of over-threshold events can provide guidance for threshold selection (Lang et al. 1999).
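The sketch below computes the quantity in Eq. (5) over a range of candidate thresholds so that it can be plotted as a mean excess plot; the function name and the commented usage are illustrative.

```python
import numpy as np

def mean_excess(pot_levels_m, thresholds_m):
    """Eq. (5): the mean excess of the POT water levels over each candidate
    threshold u; plotting thresholds_m against the result gives the mean
    excess (mean residual life) plot."""
    x = np.asarray(pot_levels_m, dtype=float)
    me = []
    for u in thresholds_m:
        excess = x[x > u] - u
        me.append(excess.mean() if excess.size > 0 else np.nan)
    return np.array(me)

# Illustrative usage: scan the upper range of the data and look for the start
# of the last approximately linear segment, e.g.,
#   thresholds = np.linspace(np.quantile(levels, 0.90), levels.max(), 50)
#   plt.plot(thresholds, mean_excess(levels, thresholds))
```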

2.2.2 Generalized Pareto distribution

A GPD is used to model the probabilistic behavior of the POT water levels. The cumulative distribution function of the GPD for a given POT water level value is

$${F}_{X}(x)=1-{\left[1+\xi \left(\frac{x-u}{\sigma }\right)\right]}^{-\frac{1}{\xi }}$$
(6)

where \({F}_{X}(x)\) is the probability of the POT water level (\(X\)) not exceeding a given value (\(x\)); and \(\xi\), \(\sigma\), and \(u\) are shape, scale, and threshold (also known as location) parameters, respectively. If an NS trend exists in the POT water levels (e.g., estimated using linear regression), two methods can be used to incorporate the NS trend in the GPD parameter estimation:

1) Derive the moments of the POT water levels conditioned on time, such as by modeling the mean (conditioned on time) as a linear regression. Then, estimate the GPD parameters using these conditional moments via the method of moments. Examples can be found in Salas et al. (2018), Serago and Vogel (2018), Hecht and Vogel (2020), and Hecht et al. (2022). The uncertainty of the GPD can be quantified using the prediction bounds of the conditional moments of the POT water levels.

2) Model the GPD parameters and/or their transformations (e.g., exponential, logarithmic) as functions of time. Then, estimate each of these parameters using the maximum likelihood method. Examples can be found in Coles (2001), Obeysekera and Park (2013), Salas and Obeysekera (2014), and Razmi et al. (2017). The uncertainty of the GPD can be calculated using the confidence bounds of the GPD parameters.

If there is an ST trend in the POT water levels, it indicates that the GPD parameters are time independent. These parameters can be estimated using several approaches, such as the method of moments and the maximum likelihood method. More details about parameter estimation can be found in Hosking and Wallis (1987).
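As a concrete example of the ST case, the sketch below fits the GPD of Eq. (6) to the threshold excesses by the method of moments, using the standard moment relations for the GPD mean and variance (cf. Hosking and Wallis 1987) written in the shape/scale convention of Eq. (6); the function names are hypothetical and the \(\xi =0\) (exponential) limit is not handled.

```python
import numpy as np

def fit_gpd_mom(pot_levels_m, u):
    """Method-of-moments estimates of the GPD shape (xi) and scale (sigma) in
    Eq. (6), from the mean and variance of the excesses over the threshold u
    (positive xi corresponds to a heavy upper tail)."""
    y = np.asarray(pot_levels_m, dtype=float) - u   # excesses over threshold
    m, v = y.mean(), y.var(ddof=1)
    xi = 0.5 * (1.0 - m**2 / v)
    sigma = 0.5 * m * (1.0 + m**2 / v)
    return xi, sigma

def gpd_cdf(x, u, sigma, xi):
    """Eq. (6): F_X(x) for a POT water level x >= u (xi != 0 assumed)."""
    z = 1.0 + xi * (np.asarray(x, dtype=float) - u) / sigma
    return 1.0 - np.maximum(z, 0.0) ** (-1.0 / xi)
```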

2.3 Annual exceedance probability

In the POT method, assuming that the occurrence of events follows a Poisson process, the exceedance probability of an event for a given time interval can be estimated using a GPD (e.g., Davison and Smith 1990; Coles 2001; Méndez et al. 2006; Eastoe and Tawn 2010; Katz 2013; Salas et al. 2018; Pan et al. 2022). The probability that the POT water level exceeds a given value during a time interval (\({\text{ti}}\)) is calculated using the total probability rule:

$${P}_{ti}(X>x)=1-\sum_{n=0}^{\infty }P\left(X\le x | n\right)\bullet {p}_{{\text{ti}}}(n)$$
(7)

where \(P\left(X\le x|n\right)\) is the conditional probability that the POT water level (\(X\)) does not exceed a given value (\(x\)) for \(n\) water levels, which can be expressed as

$$P\left(X\le x|n\right)={\left[P(X\le x)\right]}^{n}={\left[{F}_{X}(x)\right]}^{n}$$
(8)

and \({p}_{{\text{ti}}}(n)\) is the probability that \(n\) POT water levels are observed during \({\text{ti}}\); such an occurrence is a rare event. Analogous to Davison and Smith (1990), Katz et al. (2002), Eastoe and Tawn (2010), Katz (2013), Bezak et al. (2014), Bommier (2014), Jia and Sasani (2021), and Baldan et al. (2022), \({p}_{{\text{ti}}}(n)\) is assumed to follow a Poisson distribution

$${p}_{{\text{ti}}}(n)=\frac{{\lambda }^{n}\bullet {\text{exp}}(-\lambda )}{n!}$$
(9)

where \(\lambda\) is the recurrence rate, also known as the average number of over-threshold events during \({\text{ti}}\). With \({\text{ti}}\) defined as one year, \(\lambda =\) 1.70 and 2.33 for the Boston and New York City cases, respectively, and the annual exceedance probability of \(X\) exceeding \(x\) can be calculated as

$${P}_{A}\left(X>x\right)=1-\sum_{n=0}^{\infty }\left\{{\left[{F}_{X}(x)\right]}^{n}\bullet \frac{{\lambda }^{n}\bullet {\text{exp}}(-\lambda )}{n!}\right\}$$
(10)

Using a Maclaurin series expansion (Zwillinger 2012), Eq. (10) can be simplified as

$${P}_{A}\left(X>x\right)=1-{\text{exp}}\left\{-\lambda \bullet \left[1-{F}_{X}(x)\right]\right\}$$
(11)
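Equation (11) reduces to a one-line computation once \({F}_{X}(x)\) and \(\lambda\) are available; a hedged sketch is shown below, with the ST return period of Eqs. (13) and (16) noted as a comment.

```python
import numpy as np

def annual_exceedance_prob(F_x, lam):
    """Eq. (11): annual probability that the POT water level exceeds x, where
    F_x is the GPD non-exceedance probability F_X(x) from Eq. (6) and lam is
    the recurrence rate (average number of over-threshold events per year)."""
    return 1.0 - np.exp(-lam * (1.0 - np.asarray(F_x, dtype=float)))

# Under stationarity, Eqs. (13) and (16) then give the return period directly:
#   RP = 1.0 / annual_exceedance_prob(F_x, lam)
```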

2.4 Return periods

The concept of RP is widely used to describe the occurrence of natural hazards, such as flood, wind, and earthquake. In this paper, RP is used to develop flood hazard curves under ST and NS conditions. Under the assumption of stationarity, RPs can be computed easily because the time to the next exceedance of a particular event follows a geometric distribution with a constant success probability. The RP is simply the inverse of the exceedance probability of the event of interest (Read and Vogel 2015). Under nonstationarity, however, the probabilistic behavior of RPs is much more complicated because the RP no longer follows a geometric distribution with a constant success probability. In fact, with even slightly NS behavior, the probability distribution of an RP looks nothing like a geometric distribution with a constant success probability, as shown by Read and Vogel (2015). Computing RP in an NS context requires a prediction of flood events for the selected future time span along with a careful uncertainty analysis.

Within both ST and NS contexts, an RP can be interpreted as 1) the expected waiting time until an exceedance event occurs, denoted as \({RP}_{1}\) (Olsen et al. 1998; Cooley 2013; Cheng et al. 2014; Salas and Obeysekera 2014; Obeysekera and Salas 2014, 2016; Read and Vogel 2015; Salas et al. 2018; Naseri and Hummel 2022) or 2) the time associated with the expected number of exceedance events equal to one, denoted as \({RP}_{2}\) (Olsen et al. 1998; Parey et al. 2007, 2010; Cooley 2013; Obeysekera and Salas 2014, 2016; Read and Vogel 2015; Salas et al. 2018). The interpretation and computation of RP in the ST and NS contexts are elaborated below.

2.4.1 Return periods under stationarity

In an ST context, the sea level is assumed not to change in the future. \({{\text{RP}}}_{1}\) for a given water level follows a geometric distribution with a probability mass function of

$${f}_{{{\text{RP}}}_{1}}\left({{\text{rp}}}_{1}\right)={\left[{1-P}_{A}({X}_{T}>{x}_{t})\right]}^{{{\text{rp}}}_{1}-1}\bullet {P}_{A}({X}_{T}>{x}_{t})$$
(12)

where \({P}_{A}({X}_{T}>{x}_{t})\) is the annual exceedance probability that the water level (\({X}_{T}\)) exceeds a given value (\({x}_{t}\)). \({X}_{T}\) is the POT water level (\(X\), which is detrended) referenced to the sea level at the time of interest. Since the sea level is assumed not to change in the future, \({P}_{A}({X}_{T}>{x}_{t})\) is constant and equal to \({P}_{A}(X>x)\). \({{\text{RP}}}_{1}\) can be calculated as

$${{\text{RP}}}_{1}={\sum }_{{{\text{rp}}}_{1}=1}^{\infty }{{\text{rp}}}_{1}\bullet {f}_{{{\text{RP}}}_{1}}({{\text{rp}}}_{1})=\frac{1}{{P}_{A}({X}_{T}>{x}_{t})}$$
(13)

For \({{\text{RP}}}_{2}\), the number of exceedance events (\(E\)) follows a binomial distribution with probability mass function given by

$${f}_{E}\left(e\right)=\left(\begin{array}{c}{{\text{RP}}}_{2}\\ e\end{array}\right)\bullet {\left[{P}_{A}({X}_{T}>{x}_{t})\right]}^{e}\bullet {\left[1-{P}_{A}({X}_{T}>{x}_{t})\right]}^{{{\text{RP}}}_{2}-e}$$
(14)

The expected value of \(E\) is

$${\text{E}}\left[E\right]={\sum }_{e=1}^{\infty }e\bullet {f}_{E}\left(e\right)={{\text{RP}}}_{2}\bullet {P}_{A}({X}_{T}>{x}_{t})$$
(15)

Setting the expected value of \(E\) equal to 1, \({{\text{RP}}}_{2}\) can be calculated as

$${{\text{RP}}}_{2}=\frac{1}{{P}_{A}({X}_{T}>{x}_{t})}$$
(16)

Note that the two interpretations of RP are identical in the ST context (see Eqs. (13) and (16)).

2.4.2 Return periods under nonstationarity

Under NS conditions, the sea level is assumed to be NS in the future. The probability mass function of \({{\text{RP}}}_{1}\) in an NS context can be written as

$${f}_{{{\text{RP}}}_{1}}\left({{\text{rp}}}_{1}\right)=\left\{{\prod }_{i=1}^{{{\text{rp}}}_{1}-1}\left[{1-P}_{A}\left({X}_{T}>{x}_{t} \right| i)\right]\right\}\bullet {P}_{A}\left({X}_{T}>{x}_{t} \right| {{\text{rp}}}_{1})$$
(17)

where \({P}_{A}\left({X}_{T}>{x}_{t} \right| i)\) and \({P}_{A}\left({X}_{T}>{x}_{t} \right| {{\text{rp}}}_{1})\) are the annual exceedance probabilities that \({X}_{T}\) exceeds \({x}_{t}\) in years \(i\) and \({{\text{rp}}}_{1}\), respectively. These are calculated as \({P}_{A}(X>x)\) in relation to the corresponding sea levels in years \(i\) and \({{\text{rp}}}_{1}\), respectively. In other words, \({P}_{A}\left({X}_{T}>{x}_{t}\right)\) is time dependent. Analogous to Olsen et al. (1998), Cooley (2013), Cheng et al. (2014), Salas and Obeysekera (2014), Obeysekera and Salas (2014), Read and Vogel (2015), Obeysekera and Salas (2016), Salas et al. (2018), and Naseri and Hummel (2022), the expected waiting time obtained from Eq. (17) can be written as

$${{\text{RP}}}_{1}={\sum }_{{{\text{rp}}}_{1}=1}^{\infty }{{\text{rp}}}_{1}\bullet {f}_{{{\text{RP}}}_{1}}\left({{\text{rp}}}_{1}\right)=1+{\sum }_{{{\text{rp}}}_{1}=1}^{\infty }{\prod }_{i=1}^{{{\text{rp}}}_{1}}\left[{1-P}_{A}\left({X}_{T}>{x}_{t} \right| i)\right]$$
(18)

For \({{\text{RP}}}_{2}\), \(E\) is expressed as

$$E={\sum }_{i=1}^{{{\text{RP}}}_{2}}I\left({X}_{T}>{x}_{t} \right| i)$$
(19)

where \(I\) is the indicator function. \(I\) is equal to 1 when \({X}_{T}\) exceeds \({x}_{t}\) in year \(i\). Conversely, \(I\) is equal to 0 if \({X}_{T}\) is less than or equal to \({x}_{t}\) in year \(i\). According to Cooley (2013), the expected value of \(E\) can be calculated as

$${\text{E}}\left[E\right]={\sum }_{i=1}^{{{\text{RP}}}_{2}}{\text{E}}\left[I\left({X}_{T}>{x}_{t} \right| i)\right]={\sum }_{i=1}^{{{\text{RP}}}_{2}}{P}_{A}\left({X}_{T}>{x}_{t} \right| i)$$
(20)

Setting the expected value of \(E\) equal to 1, \({{\text{RP}}}_{2}\) in an NS context can be calculated numerically by solving

$$1={\sum }_{i=1}^{{{\text{RP}}}_{2}}{P}_{A}\left({X}_{T}>{x}_{t} \right| i)$$
(21)

Note that if there is an NS trend in the POT water levels, \({F}_{X}(x)\) and, in turn, \({P}_{A}\left(X>x\right)\) are time dependent. In that case, two sources of nonstationarity (i.e., NS trends in both the sea levels and the POT water levels) would need to be incorporated into the calculation of \({P}_{A}\left({X}_{T}>{x}_{t} \right| i)\).
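A numerical sketch of Eqs. (18) and (21) is given below; it assumes the annual exceedance probabilities for future years are supplied as an array (their construction from sea level trend projections is described in Sect. 2.5), truncates the infinite sum in Eq. (18) at the end of the supplied horizon, and reports \({{\text{RP}}}_{2}\) in whole years.

```python
import numpy as np

def rp1_expected_waiting_time(p_annual):
    """Eq. (18): expected waiting time RP1, where p_annual[i-1] is the annual
    exceedance probability P_A(X_T > x_t | i) for future years i = 1, 2, ...
    The array should extend far enough (e.g., to 2200) for the sum to converge;
    any remainder beyond the supplied horizon is truncated."""
    survival = np.cumprod(1.0 - np.asarray(p_annual, dtype=float))
    return 1.0 + survival.sum()

def rp2_unit_expected_exceedances(p_annual):
    """Eq. (21): the smallest horizon RP2 (in whole years) for which the
    expected number of exceedances, Eq. (20), reaches one."""
    cumulative = np.cumsum(np.asarray(p_annual, dtype=float))
    idx = np.flatnonzero(cumulative >= 1.0)
    return int(idx[0]) + 1 if idx.size else None  # None: horizon too short
```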

2.5 Sea level trend projection incorporation

As discussed above, computing RP in an NS context requires a prediction of flood events for the selected future time span along with a careful uncertainty analysis. From a statistical standpoint, two characteristics of future flood events need to be predicted: the probabilistic behavior of the POT water levels discussed in Sect. 2.2.2 (Generalized Pareto distribution) and the sea level trend. Considering that the future sea level trend has been well studied, the existing sea level trend projections with uncertainty (e.g., Yin et al. 2009; Kopp et al. 2014; Sweet et al. 2017, 2022) are incorporated into the RP calculation under NS conditions in this paper.
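One way to read this step, sketched below under stated assumptions, is to reference the fitted (detrended) GPD to the projected sea level of each future year, i.e., to evaluate \({P}_{A}\left({X}_{T}>{x}_{t} \right| i)\) as \({P}_{A}(X>{x}_{t}-{s}_{i})\), where \({s}_{i}\) is the projected sea level in year \(i\) relative to the datum of the detrended record (e.g., interpolated between the Kopp et al. (2014) projection years); the function name and the handling of levels at or below the threshold are illustrative simplifications.

```python
import numpy as np

def exceedance_prob_series(x_t, slr_projection_m, u, sigma, xi, lam):
    """Hedged sketch of Sect. 2.5: annual exceedance probabilities
    P_A(X_T > x_t | i) for future years, where slr_projection_m[i] is the
    projected sea level in year i relative to the datum of the detrended
    record, and (u, sigma, xi, lam) are the fitted GPD/Poisson parameters."""
    x_t = float(x_t)
    p = []
    for s_i in np.asarray(slr_projection_m, dtype=float):
        x_eff = x_t - s_i   # equivalent detrended level in year i
        if x_eff <= u:
            # Simplification: at or below the threshold, use the exceedance
            # probability of the threshold itself (F_X = 0).
            F = 0.0
        else:
            F = 1.0 - max(1.0 + xi * (x_eff - u) / sigma, 0.0) ** (-1.0 / xi)
        p.append(1.0 - np.exp(-lam * (1.0 - F)))   # Eq. (11) in year i
    return np.array(p)

# The returned array can be passed to the RP1/RP2 sketches of Sect. 2.4.2.
```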

3 Results and discussions

The proposed methodology for NS coastal flood frequency analysis is applied to two case studies using water level data recorded by NOAA stations near Boston, Massachusetts, and New York City, New York (NOAA 2023). The NOAA station locations in Boston Harbor, Massachusetts, and New York Battery Bay, New York, are shown in Fig. 1. The stations’ information is shown in Table 1. For Boston, the monthly mean sea levels (for SLR estimation) and hourly water levels (for POT analysis of detrended water levels) have the same record range. However, for New York City, the records for monthly mean sea levels begin in 1856 but are incomplete between 1879 and 1892, and the hourly water level records start in 1927. To be consistent with the frequency analysis for the Boston case, the SLR for the New York City case is estimated using the monthly mean sea levels from 1927 to 2022. All water level data presented in this paper are referenced to the North American Vertical Datum of 1988.

Fig. 1 NOAA water level station locations near a Boston and b New York City (map data from Esri | USGS, NOAA; tiles © Esri; source: Esri, i-cubed, USDA, USGS, AEX, GeoEye, Getmapping, Aerogrid, IGN, IGP, UPR-EGP, and the GIS User Community)

Table 1 NOAA water level station information

3.1 Sea level rise

The estimated \({\beta }_{1}\), i.e., the slope in Eq. (1), and the statistics of the estimated SLR trends for Boston and New York City obtained via the ordinary least squares method are shown in Table 2. In both cases, the \(p\)-value for the estimated \({\beta }_{1}\) is less than the 5% significance level. Accordingly, the null hypothesis that the sea levels are ST (i.e., \({\beta }_{1}=0\)) is rejected. \(\varepsilon ^{{\prime }}\) is normal, homoscedastic, and independent (see the regression residual analysis in Appendix 5.1). The estimated sea levels and corresponding 95% prediction intervals are shown in Fig. 2.

Table 2 \({\beta }_{1}\) and the statistics of the estimated SLR trends
Fig. 2 Estimated sea levels and corresponding 95% prediction intervals developed using ordinary least squares estimators. a Boston and b New York City

3.2 Probabilistic behavior of the peak-over-threshold water levels

For the Boston and New York City cases, the mean excess plots are shown in Fig. 3. According to Davison and Smith (1990), large fluctuations observed at high threshold values are the result of sampling variation in the data extremes and are not significant in determining the threshold (i.e., threshold values higher than 2.38 and 1.63 m for the Boston and New York City cases, respectively). In Fig. 3, the linearized portions of the mean excess plots (preceding the large fluctuations) are generated using the method outlined in Joyner et al. (2022). Similar to Davison and Smith (1990) and Razmi et al. (2017), the thresholds are selected to equal the starting point of the last linear segment (i.e., \(u\) = 2.04 m and 1.19 m for the Boston and New York City cases, respectively). The correlograms (Fig. 11) in Appendix 5.2 illustrate that the selected POT water levels meet the independence condition.

Fig. 3 Mean excess plots. a Boston and b New York City

In addition to the mean excess plot, the average number of over-threshold events can provide guidance for threshold selection (Lang et al. 1999). For the Boston and New York City cases, the average numbers of over-threshold events (using \(u\) = 2.04 m and 1.19 m) are 1.70 and 2.33, respectively. These average numbers of over-threshold events are comparable with those estimated by Nadal-Caraballo et al. (2016) – 1.50 and 2.00. The difference is due primarily to differing independence conditions (three days between peaks vs. two days from the end of one storm to the start of the next used by Nadal-Caraballo et al. 2016) and the use of different threshold selection methods (mean excess plot vs. the quantile–quantile optimization method used by Nadal-Caraballo et al. 2016). Using Eqs. (11) and (13), the RPs corresponding to the selected thresholds and their average numbers of over-threshold events under ST conditions are 1.22 and 1.12 years for the Boston and New York City cases, respectively, which are comparable to the RPs discussed in Lang et al. (1999) – 1.15 years and 1.20 to 2.00 years.

To evaluate whether there is an NS trend in the POT water levels, which would affect the subsequent probability distribution fitting, a linear regression is fitted to the POT water levels. Figure 4b shows an outlier for the New York City case resulting from Hurricane Sandy in 2012. To mitigate the effect of outliers, Theil-Sen (Theil 1950) estimators – which, in contrast with ordinary least squares estimators, are not overly sensitive to outliers (Helsel et al. 2020) – are used to fit the linear trend in the POT water levels. For both the Boston and New York City cases, the \(p\)-values for the estimated \({\beta }_{1}\), i.e., the slope in Eq. (1), are higher than the 5% significance level (see Fig. 4). Therefore, the ST trend in the POT water levels is not rejected.
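A hedged sketch of this screening step is shown below; the paper reports a \(p\)-value for the Theil-Sen slope without naming the associated test, so pairing the slope with a Kendall's tau trend test here is an assumption.

```python
import numpy as np
from scipy import stats

def pot_trend_check(peak_times_yr, pot_levels_m, alpha=0.05):
    """Hedged sketch of the trend screening behind Fig. 4: Theil-Sen slope of
    the POT water levels versus time, with a Kendall's tau test used here as
    the (assumed) significance test for the trend."""
    t = np.asarray(peak_times_yr, dtype=float)
    z = np.asarray(pot_levels_m, dtype=float)
    slope, intercept, lo_slope, hi_slope = stats.theilslopes(z, t)
    tau, p_value = stats.kendalltau(t, z)
    nonstationary = p_value < alpha   # True -> the ST trend is rejected
    return slope, p_value, nonstationary
```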

Fig. 4 Linear trends of the POT water levels developed using Theil-Sen estimators. a Boston and b New York City

A GPD is used to model the probabilistic behavior of the POT water levels. Since the ST trend in the POT water levels is not rejected, it is reasonable to assume that the GPD parameters are time independent. These parameters can be estimated using several approaches, such as the method of moments and the maximum likelihood method; more details about parameter estimation can be found in Hosking and Wallis (1987). For both the Boston and New York City cases, the method of moments fits the POT water levels better than the maximum likelihood method does. As such, the method of moments is used in this paper. The quantile–quantile plots between the POT water levels and those estimated via the method of moments are shown in Fig. 5 using the Gringorten plotting position (Gringorten 1963), as recommended by Kim et al. (2008). The Pearson correlation coefficient (\(\rho\)) between the POT water levels and the estimated POT water levels is used as a measure of goodness-of-fit (see Fig. 5).
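The sketch below reproduces the mechanics of this check under the stated choices (Gringorten plotting position, GPD quantiles from Eq. (6), and the Pearson correlation coefficient); the function name is hypothetical.

```python
import numpy as np

def gpd_qq_check(pot_levels_m, u, sigma, xi):
    """Hedged sketch of the goodness-of-fit check behind Fig. 5: empirical
    quantiles (Gringorten plotting position) against GPD quantiles from
    Eq. (6), summarized by the Pearson correlation coefficient (xi != 0)."""
    x_emp = np.sort(np.asarray(pot_levels_m, dtype=float))
    n = x_emp.size
    ranks = np.arange(1, n + 1)
    p = (ranks - 0.44) / (n + 0.12)              # Gringorten plotting position
    x_gpd = u + (sigma / xi) * ((1.0 - p) ** (-xi) - 1.0)  # inverse of Eq. (6)
    rho = np.corrcoef(x_emp, x_gpd)[0, 1]        # goodness-of-fit measure
    return x_emp, x_gpd, rho
```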

Fig. 5 Quantile–quantile plots for the detrended POT water levels. a Boston and b New York City

3.3 Sea level trend projection with uncertainty

Since there is an ST trend in the POT water levels, it is reasonable to assume that the probabilistic behavior of the POT water levels remains constant in the future (see Fig. 4). Thus, only one characteristic of future flood events, the sea level trend projection, needs to be estimated. Three sea level trend projections between 2000 and 2200 estimated by Kopp et al. (2014) are used for flood event prediction in this paper. These three projections are developed for low, intermediate, and high greenhouse gas emission scenarios, corresponding to representative concentration pathways (RCPs) of 2.6, 4.5, and 8.5, respectively. RCPs 2.6, 4.5, and 8.5 describe three twenty-first century pathways of greenhouse gas emissions and atmospheric concentrations, air pollutant emissions, and land use (IPCC 2014). Note that the sea level trend projections estimated by Kopp et al. (2014) are presented as the sea level changes in 2030, 2050, 2100, 2150, and 2200 with respect to the sea level in 2000. In this paper, the sea level in 2000 is obtained from the SLR trend estimated using Eq. (1). As a user-defined parameter, the future time span is selected to be the current year through 2200, which is consistent with Kopp et al. (2014). Figure 6 shows these sea level trend projections.

Fig. 6 Sea level trend projections with uncertainty estimated by Kopp et al. (2014) for three greenhouse gas emission scenarios (shown as RCP) with corresponding 2.5th–97.5th percentile intervals for Boston (left column) and New York City (right column); a, b RCP 2.6, c, d RCP 4.5, and e, f RCP 8.5

Regardless of the method used to project sea level trends, the projection always exhibits uncertainty. For the sea level trend projections estimated by Kopp et al. (2014), the uncertainty is presented as the 2.5th–97.5th percentile interval of the estimate, calculated using local sea level probability distributions. As mentioned in Kopp et al. (2014), local sea level probability distributions are calculated using samples from time-dependent probability distributions of cumulative contributions to sea level rise for each of the individual components (i.e., ice sheet components, glaciers and ice caps, global mean thermal expansion and regional ocean steric and dynamic effects, land water storage, and long-term, local, nonclimatic sea level change due to factors such as glacial isostatic adjustment, sediment compaction, and tectonics). The uncertainty of the sea level trend projections is shown in Fig. 6 as shaded areas.

3.4 Flood hazard curves

In this section, two RPs are used to develop flood hazard curves under different sea level trend projections with uncertainty. The developed flood hazard curves can be used to define the design flood level, estimate the performance of structures under coastal flooding, and assess community resilience.

3.4.1 A retrospective comparison with other studies and historical flood events

Figure 7 shows the flood hazard curves developed under the projected sea level trends for three greenhouse gas emission scenarios estimated by Kopp et al. (2014). For comparison purposes, the flood hazard curves under ST conditions and those developed by extrapolating the estimated SLR trend (see Fig. 2) are also shown in Fig. 7. The flood hazard curves in this paper, which can support conclusions about future RPs, are compared with other studies that report either the 100-year water level or the RP corresponding to a particular event. Under ST conditions, the 100-year water levels reported by other studies are shown in Fig. 7 for comparison with the developed ST flood hazard curves. Under NS conditions, however, no direct comparison is available because there are no previous studies that incorporate RPs in an NS context in frequency analyses for Boston and New York City.

Fig. 7 Flood hazard curves developed under different sea level trend projections (stationary, extrapolating the estimated SLR trend, and three greenhouse gas emission scenarios estimated by Kopp et al. (2014)) compared with other studies and historical flood events for Boston (left column) and New York City (right column) interpreted using a, b \(R{P}_{1}\) and c, d \(R{P}_{2}\)

For the Boston case, the 100-year water levels estimated by FEMA (2016) and Nadal-Caraballo et al. (2016) are slightly lower than the ones estimated by the ST flood hazard curves in this paper (see Fig. 7a and c). This is primarily because (1) FEMA (2016) used historical water level data for frequency analysis without considering sea level trend, and (2) Nadal-Caraballo et al. (2016) used different water level data (hourly water levels and monthly maximum water levels) for sea level trend estimation. As compared with Nadal-Caraballo et al. (2015), the 100-year water level estimated by the ST flood hazard curve is lower, with a relative difference of 3% (see Fig. 7a and c). This is because Nadal-Caraballo et al. (2015) used water level data derived from thousands of simulated storms obtained by storm surge models instead of using historical water level data. As expected, in comparison with thousands of simulated storms, the historical record is relatively minimal and includes fewer extreme storm occurrences, resulting in limited water level data for hazard analysis.

For the New York City case, a result similar to that of the Boston case is observed (see Fig. 7b and d). The 100-year water level estimated by Nadal-Caraballo et al. (2016), which used historical water level data, is lower than the one derived using the ST flood hazard curve. Conversely, the 100-year water levels estimated using data from thousands of simulated storms (FEMA 2014; Nadal-Caraballo et al. 2015) and hundreds of simulated storms (FEMA 2013) are higher than the ones derived using the ST flood hazard curve. The corresponding relative differences vary from 29 to 45%, which are much larger than the 3% relative difference resulting from Nadal-Caraballo et al. (2015) for the Boston case. The main reason for this discrepancy is the different geographical locations. Compared with Boston, New York City has a lower latitude (see Table 1), which indicates that New York City is more likely to be hit by tropical cyclones (e.g., Hurricane Sandy in 2012). Tropical cyclones tend to produce higher surges than extratropical cyclones (e.g., the Blizzard of ’78) because they typically exhibit higher intensity (Nadal-Caraballo et al. 2015). Considering that most of the simulated storms developed by FEMA (2013), FEMA (2014), and Nadal-Caraballo et al. (2015) are tropical cyclones, the resulting water levels could include more extreme values than in the Boston case. Also, the historical record contains limited water level data and relatively sparse extreme storm occurrences. Therefore, the 100-year water levels estimated by FEMA (2013), FEMA (2014), and Nadal-Caraballo et al. (2015) are much higher than the one derived using the ST flood hazard curve and close to the water levels from the NS flood hazard curves.

The historical flood events that resulted in the highest water levels, the Blizzard of ’78 (1978) for Boston and Hurricane Sandy (2012) for New York City, are shown in Fig. 7. For ease of comparison, the RPs for these two events obtained from the flood hazard curves under different sea level trend projections are shown in Table 3. For both the Boston and New York City cases, the RPs decrease as the slope of the sea level trend projections increases (see Fig. 6). It can be concluded that the use of this paper’s approach to NS frequency analysis can lead to considerably lower estimates of the RPs of historical flood events.

Table 3 RPs for historical events estimated by the flood hazard curves developed using different sea level trend projections

The highest water levels are outliers that affect the trend of the POT water levels, their probability distribution, and, in turn, the flood hazard curves (see Figs. 4, 5, and 7). If Hurricane Sandy had hit Boston at high tide, 5.5 h earlier than it did, a 3-m storm surge could have been recorded (CBS News Boston 2012), and more than 6% of Boston would have been underwater (Loth 2013; Friedman et al. 2019). This would have resulted in a water level higher than the one caused by the Blizzard of ’78 and would have affected the behavior of the flood hazard curves developed here. As such, if a Hurricane Sandy-level event were to hit Boston at high tide, the flood hazard curves shown in Fig. 7 would likely underestimate the hazard. Considering that the time interval between high tide and the storm surge peak can affect the recorded water level, an alternative NS frequency analysis could be performed that focuses on storm surges, as opposed to the water levels discussed in this paper. The storm surges can be obtained by removing the astronomical tide heights from the historical water levels. The probability distribution of the detrended storm surges can be modeled using the proposed methodology and combined with different tidal levels and sea level trend projections. Compared with the flood hazard curves shown in Fig. 7, the resulting flood hazard curves would account for the effect of storm landfall time.

In terms of the different interpretations of RP, both \({RP}_{1}\) and \({RP}_{2}\) are used for developing flood hazard curves in an NS context (Olsen et al. 1998; Cooley 2013; Obeysekera and Salas 2014; Read and Vogel 2015). As shown in Table 3, \({{\text{RP}}}_{2}\) is larger than \({{\text{RP}}}_{1}\) for a given water level. In other words, the water level estimated using \({{\text{RP}}}_{1}\) is higher than the one estimated using \({{\text{RP}}}_{2}\). Similar observations are reported by Olsen et al. (1998) and Read and Vogel (2015). While the \({{\text{RP}}}_{1}\) value results in a more conservative water level estimation, it requires the trends to be extrapolated indefinitely. The \({{\text{RP}}}_{2}\) calculation, however, requires extrapolation only over the \({{\text{RP}}}_{2}\) years, and therefore it is considered a more suitable choice.

3.4.2 Uncertainty of flood hazard curves

The uncertainty of flood hazard curves arises from two sources: probability distributions of POT water levels and sea level trend projections. As discussed above, it is reasonable to assume that the probability distribution of future POT water levels is constant. Therefore, the uncertainty arising from the probability distributions of POT water levels is not included in this paper. Only the uncertainty arising from sea level trend projections is considered (see Fig. 6). The uncertainty of the flood hazard curves is quantified by the same statistical interval as the uncertainty of the sea level trend projections and is shown in Figs. 8 and 9 as shaded areas.
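For completeness, the sketch below shows how such bracketing curves can be assembled; annual_prob_fn is a hypothetical callable that returns the annual exceedance probabilities for a given water level under one projection percentile (e.g., built as sketched in Sect. 2.5), and the \({{\text{RP}}}_{2}\) interpretation is used.

```python
import numpy as np

def flood_hazard_curve(water_levels_m, annual_prob_fn):
    """Hedged sketch: flood hazard curve (water level vs. RP2) for one sea
    level projection percentile. annual_prob_fn(x_t) is assumed to return the
    array of annual exceedance probabilities P_A(X_T > x_t | i) for the future
    years of interest."""
    rps = []
    for x_t in water_levels_m:
        p_annual = np.asarray(annual_prob_fn(float(x_t)), dtype=float)
        cumulative = np.cumsum(p_annual)         # expected exceedances, Eq. (20)
        idx = np.flatnonzero(cumulative >= 1.0)  # Eq. (21): first year reaching one
        rps.append(float(idx[0]) + 1.0 if idx.size else np.nan)
    return np.array(rps)

# Evaluating the curve with the 2.5th, 50th, and 97.5th percentile projections
# brackets the hazard curve, as in Figs. 8 and 9.
```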

Fig. 8 Flood hazard curves with uncertainty interpreted using \(R{P}_{1}\) under sea level trend projections (shown as RCP) estimated by Kopp et al. (2014) for Boston (left column) and New York City (right column); a, b RCP 2.6, c, d RCP 4.5, and e, f RCP 8.5

Fig. 9 Flood hazard curves with uncertainty interpreted using \(R{P}_{2}\) under sea level trend projections (shown as RCP) estimated by Kopp et al. (2014) for Boston (left column) and New York City (right column); a, b RCP 2.6, c, d RCP 4.5, and e, f RCP 8.5

4 Conclusions

A straightforward approach for NS coastal flood frequency analysis is presented to address two important issues: (1) sea level trend projections that are not incorporated in frequency analysis and (2) the impact of NS behavior on the widely used RP concept. This NS coastal flood hazard analysis can be conducted in four main steps: (1) estimating SLR to detrend water level data, (2) fitting probability distributions to POT water levels, (3) incorporating the existing sea level trend projections in RP calculations, and (4) describing the NS behavior of RPs (\(R{P}_{1}\) and \(R{P}_{2}\)). The final result is flood hazard curves with uncertainty that describe the probabilistic behavior of future coastal water levels.

Two case studies are conducted, one for Boston and one for New York City. The results demonstrate that, compared with frequency analysis under ST conditions, the use of this NS coastal flood frequency analysis can lead to considerably lower estimates of the RPs of historical flood events. The comparison between the two interpretations of RP shows that, for a given water level, \({{\text{RP}}}_{2}\) is larger than \({{\text{RP}}}_{1}\).

Notable advantages of this NS coastal flood frequency analysis include:

(1) As opposed to extrapolating the estimated SLR trend, which is based on a purely statistical approach, sea level trend projections that model various future climate change scenarios with physical bases (e.g., greenhouse gas emissions scenarios in Kopp et al. 2014) are used in the development of flood hazard curves. This can provide users with a better understanding of the probabilistic behavior of future coastal flood events under climate change.

(2) Two interpretations of RP (\({{\text{RP}}}_{1}\) and \({{\text{RP}}}_{2}\)) are evaluated in an NS context. Both \({{\text{RP}}}_{1}\) and \({{\text{RP}}}_{2}\) are well suited to the development of flood hazard curves. Normally, flood hazard curves do not include RPs in an NS context and, thus, are representative of only a particular moment in time. In contrast, engineers can use the flood hazard curves presented in this paper in conjunction with a planning horizon for coastal infrastructure design, evaluation of infrastructure performance, and community resilience assessment.

(3) This NS coastal flood frequency analysis is a user-friendly approach for engineers because advanced training in statistics is not needed to perform it. This approach can be applied to any city with sufficient historical water level data and extended to other hazard measures (e.g., streamflow, precipitation).

(4) The improvements derived from this NS coastal flood frequency analysis include a method to account for the uncertainty of probability distributions of POT water levels and the estimation of sea level trend projections that combine historical water levels with various climate change scenarios.