Abstract
This study first validates the ASGF algorithm developed in part I with an analytical solution in a simplified dynamical system and with a real storm surge event. It then assesses the computational efficiency of the ASGF method compared to the traditional method. By analyzing a realistic case, the ASGF method is shown to be three orders of magnitude more computationally efficient than the traditional method. Using the singular value decomposition (SVD) and the fast Fourier transform and its inverse (FFT/IFFT), this study further demonstrates how to compress atmospheric forcing data and how to cast the ASGF convolution as a simple and efficient regression model for data assimilation. When tested with the real storm surge event, the output from the regression model can account for 98 % of the observed variance.
1 Introduction
In part I of this study, Xu (2015) proposed a new method of modeling storm surges called the ASGF method, which uses the all-source Green’s function (ASGF) as its core technique. The ASGF is a matrix pre-calculated from a numerical storm surge model. Each column of the ASGF matrix is a Green’s function that corresponds to an impulse at a grid point; there are as many such columns as the total number of model grid points. As shown in Eq. (45) of Appendix 3 in part I (copied here with the subscripts and tilde sign omitted), a complex storm surge model can be described as a simple convolution:
\[ \boldsymbol{\upeta} = \mathbf{G} * \mathbf{f} \qquad (1) \]
where G is an ASGF matrix prepared for a point of interest (POI), f is a time-variant globally distributed atmospheric forcing vector, and η is the resulting time series of the sea surface elevation at the POI. Part I also interpreted the meaning of the ASGF from different perspectives and introduced both the memory time scale, over which the ocean remembers past storm surges, and the sampling rate for the ASGF (cf. Sections 3.2 and 3.3 in part I).
This paper is the second part of this study and will first test the ASGF convolution with a simple case and then with a realistic case. It will then assess the computational efficiency of the ASGF method compared to the traditional method. Then, a regression model based on the ASGF convolution will be developed for data assimilation. The simple case test will be performed in a non-rotational, rectangular basin with a flat bottom. The analytical solution to the wave motions in such a basin is obtainable and can be used to verify the ASGF convolution. Section 2.1 will focus on this simple case test. The realistic case test will be presented in Section 2.2, where real storm surges observed at Sept-Îles in Quebec, Canada, in the Gulf of St. Lawrence in December 2010 will be used to test the performance of the ASGF convolution. The analysis of the computational efficiency of the ASGF method will be presented in Section 3.
To obtain a regression model, this study uses the singular value decomposition (SVD) to decompose the ASGF convolution matrix. The application of SVD provides two benefits: the atmospheric forcing field can be significantly compressed, which in turn can facilitate data storage and speed up data retrieval and computations, and the singular values from the SVD can be used as the regression parameters to best fit the simulations to the observations. These points are described in detail in Sections 4 and 5. This study will also demonstrate how the fast Fourier transform and its inverse (FFT/IFFT) can be used to speed up the ASGF convolution. Three appendixes provide complementary details and a MATLAB function that is used to compute the ASGF convolution.
2 Testing the ASGF algorithm
This section tests the ASGF algorithm in two ways: validating the algorithm with a simple problem for which an analytical solution is available as a standard for comparison, and testing the ASGF algorithm with a real storm surge event to see how well the predictions by the algorithm agree with the observations. As we will see soon, the algorithm passes both tests very well.
2.1 Test in a simplified dynamical system
This subsection validates the ADI and ASGF numerical approaches with an analytical solution in a rectangular, non-rotational, and flat-bottom basin. Appendix 1 details how to obtain the analytical solution driven by an arbitrary atmospheric forcing field and by a constant west wind. Equation (A34) in the appendix shows how the sea surface evolves under the constant and uniform west wind; it is used in this study for comparison with the numerical solution.
To obtain the numerical solution to the same test problem, the Arakawa C-grid (Arakawa and Lamb 1977) is placed over the rectangular basin: let a = b = 100 km be the sizes of the rectangular basin, and let Δx = Δy = 1 km be the grid spacing. The abovementioned constant and uniform wind stress is applied to the velocity points of the grid. Additionally, the real water depth at Sept-Îles, h = 41 m, is set to be the depth of the basin. The friction parameter is set to κ = 0.0028 m/s (cf. Eqs. (A1) to (A5)).
The above specifications completely determine the numerical solution. Snapshots of the solution at four times are presented in the left panels of Fig. 1, together with their analytical counterparts in the right panels. The numerical solution was obtained using Eq. (10) in part I with the ADI scheme, where a 10-sec time step was used. The maximum domain-wise absolute differences between the two solutions are noted in the left panels. As shown, the numerical solution is very close to the analytical solution, differing by approximately 1 mm, which corresponds to a relative error of approximately 1 %. The top three panels on the left and right both show how the constant west wind results in the sea level dropping at the west coast and piling up at the east coast within the first 30 min. Both the left and right bottom panels show that the sea level appears to be a plane with a constant slope after 1 day (i.e., 1440 min). The constant slope was developed through a series of decaying waves, which are clearly shown in the next figure.
The ASGFs used in this study are derived from Eq. (10), the correctness of which hence needs to be verified first. The comparison shown in Fig. 1 gives a satisfactory test result. However, it would not be adequate if this concluded the comparison because there is a significant amount of algebra involved from Eq. (10) to the ASGF convolution. In the comparisons shown in Figs. 2 and 3 below, the numerical solutions are all obtained using the ASGF convolution, Eq. (1). Figure 2 consists of four panels, where panels A and B show the time series of the sea levels near the east and west coasts (i.e., half grid size away from the coasts) for the first 12 hours, and panels C and D show the same time series but over 72 hours, which all demonstrate the closeness between the numerical solution in green and the analytical solution in red (the red is almost completely masked by the green). The maximum of the absolute and relative errors of the numerical solutions for the entire length of the time series is shown on the panels. The maximum absolute error is 4.23 mm, and the relative error is 1 %. Figure 3 shows the sea levels after 3 days from the west to the east coast: The straight line in red is the analytical solution given by Eq. (A36), and the two green “+” symbols are the outputs of the ASGF convolution. As shown, the two green “+” symbols lie perfectly on the theoretical line. The value of the sea-surface slope calculated based on the two green “+” symbols and the value of the theoretical asymptotic slope are denoted as “slope” and τ/gh, respectively, in the panel. The two values are shown to be very close, with a difference of 7.7562 × 10^−10.
2.2 Test with a real storm surge event
The ASGF algorithm is now tested with a real storm surge case. Between the 6th and 7th of December 2010, a large storm moved over the Estuary and Gulf of St. Lawrence. Figure 4a shows a snapshot of the air pressures at the mean sea level based on the Modern-Era Retrospective Analysis for Research and Applications (MERRA) data. The storm caused a large surge in the sea level that damaged coastal highways and many residential properties (Fig. 4b, c). For this test, the tidal gauge station at Sept-Îles, which is operated by the Canadian Hydrographic Service, is used as the POI (Fig. 4d). The ASGF matrix G for this POI has been calculated and is shown in part I (see Section 4.1 and Fig. 3 in part I). The MERRA dataset is used to supply the atmospheric forcing field. This dataset was produced by NASA and is available at http://gmao.gsfc.nasa.gov/merra/. It is a re-analyzed dataset that uses NASA’s global data assimilation system and a variety of global observing systems and is meant to provide the science and applications communities with a state-of-the-art global dataset. Its temporal resolution is hourly, its spatial resolution is 0.50 degrees of latitude and 0.67 degrees of longitude, and it covers a time period from 1979 to the present. This dataset is adopted in this study because of its highly realistic solutions, its fine temporal resolution, and its global coverage. For any hour, the MERRA dataset provides a forcing vector of 408,622 elements consisting of the points of air pressures and wind stresses in the global ocean.
Equation (1) should be computed through the fast Fourier transform (FFT) and its inverse (IFFT) because the computational efficiency will be significantly enhanced thereby. Computing a convolution through FFT/IFFT is known as the convolution theorem or the convolution rule (Strang 1986). Equation (B16) in Appendix 2.B4 shows how the theorem can be applied to matrices; Appendix 2.B5 shows how the theorem is implemented in a MATLAB function, conv_FFT. Figure 5 compares the observation and the simulation; the red curve is the simulation computed using the function conv_FFT. The figure shows that the observed surge that peaked at 0 hours on 7 December is well captured by the simulation. The overall agreement between the observation and simulation for the entire simulation period (441 hours) is also good. γ², the ratio of the variance of the misfit between the observation η_o and the simulation η to the variance of the observation, quantitatively measures the overall misfit:
\[ \gamma^{2} = \frac{\mathrm{Var}\left(\boldsymbol{\upeta}_{o} - \boldsymbol{\upeta}\right)}{\mathrm{Var}\left(\boldsymbol{\upeta}_{o}\right)} \qquad (2) \]
A smaller value of this ratio indicates a better agreement. The value of γ² shown in the title of the figure is 0.18, which means that 82 % of the observed variance is accounted for by the simulation. The agreement is very satisfactory, considering that no data assimilation technique has been applied yet. Section 5 will introduce a data assimilation technique, with which the value of γ² will be further reduced to 0.02. The effects of global forcing and the global ocean geometry have been accounted for in the simulation by the ASGF.
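The misfit measure can be sketched in a few lines of Python. This is a minimal illustration that assumes γ² is the variance of the residual (observation minus simulation) divided by the variance of the observation, which is consistent with reading 1 − γ² as the fraction of observed variance accounted for. The function name `gamma2` and the toy series are hypothetical and are not data from this study.

```python
from statistics import pvariance

def gamma2(observed, simulated):
    """Residual variance divided by observed variance (assumed form of the misfit ratio)."""
    if len(observed) != len(simulated):
        raise ValueError("series must have the same length")
    residual = [o - s for o, s in zip(observed, simulated)]
    # The same variance estimator is used on top and bottom, so the choice
    # between population and sample variance does not change the ratio.
    return pvariance(residual) / pvariance(observed)

# Hypothetical toy series, for illustration only
obs = [0.0, 0.5, 1.2, 0.8, -0.3, -0.6]
sim = [0.1, 0.4, 1.0, 0.9, -0.2, -0.5]

g2 = gamma2(obs, sim)
explained = 1.0 - g2  # fraction of observed variance accounted for
print(f"gamma^2 = {g2:.3f}, explained fraction = {explained:.3f}")
```

Under this reading, a value of 0.18 corresponds to 82 % of the observed variance being accounted for, and 0.02 corresponds to 98 %.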
3 Analysis of the computational efficiency gain for the forced wave problem
This section analyzes the computational efficiency gained by the ASGF method compared to the traditional method. The number of multiplications involved in each of the methods to complete a given simulation task will be used to indicate their computational efficiency; the method that involves fewer multiplications is a more efficient method. To use the number of multiplications as a measure of computational efficiency is a standard approach; for example, the efficiency of the FFT algorithm is assessed by considering how many fewer multiplications it requires than the discrete Fourier transform (DFT) algorithm. Addition and subtraction operations are not counted because they are far less computationally expensive than multiplications. A division can be treated as a multiplication of the inverse of the divisor.
Using \( \#*_{\mathrm{trad}} \) and \( \#*_{\mathrm{asgf}} \) to denote the number of multiplications required by the traditional and ASGF methods, respectively, we can define the following measure to quantify the computational gain by the ASGF method:
When gain > 0 or gain < 0, the ASGF method is more or less efficient, respectively, than the traditional method. When gain = 0, the efficiencies of the two methods are equal.
As noted at the end of Section 2 in part I, it is appropriate to view Eq. (10) of part I as a representative of the traditional method. The number of multiplications per time step required by Eq. (10) of part I can be generally expressed as N × m, where N indicates the number of elements of the state vector and m indicates the number of multiplications needed to update each of the elements; the latter value depends on the actual difference scheme involved in Eq. (10) of part I. Appendix 5 in part I analyzes the ADI scheme (Leendertse 1967) and finds that m = 12. Although not shown in Appendix 5 in part I, it could be easily found that m = 3 for Sielecki’s (1968) explicit-implicit (EI) scheme. The ADI method requires a higher value of m than the EI scheme; however, the former is unconditionally stable, whereas the latter is only conditionally stable. In the real case shown later, the time step with the ADI scheme can be 60 sec, whereas the time step with the EI scheme can only be 5 sec because it is constrained by the CFL condition. The total number of multiplications involved in a simulation is proportional to m and inversely proportional to the time step; thus, the ADI scheme is preferred to the EI scheme in terms of computational efficiency in addition to its nearly perfect conservation of the total energy, as discussed in Appendix 1 of part I.
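The preference for the ADI scheme can be checked with a short worked ratio. The sketch below assumes the same state-vector length N and the same simulation length T for both schemes, so that only m and the time step differ; the subscripts ADI and EI are introduced here for illustration.

```latex
\frac{\#*_{\mathrm{ADI}}}{\#*_{\mathrm{EI}}}
  = \frac{m_{\mathrm{ADI}} / \Delta t_{\mathrm{ADI}}}{m_{\mathrm{EI}} / \Delta t_{\mathrm{EI}}}
  = \frac{12 / 60\ \mathrm{s}}{3 / 5\ \mathrm{s}}
  = \frac{0.2}{0.6}
  = \frac{1}{3}
```

With the values quoted above, the ADI scheme therefore needs roughly one third of the multiplications of the EI scheme, even though it costs four times as many multiplications per time step.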
Using T as the total simulation time and Δt_trad as the time step used by the traditional method, the number of time steps is T/Δt_trad. The total number of multiplications involved can then be calculated by:
\[ \#*_{\mathrm{trad}} = N\, m\, \frac{T}{\Delta t_{\mathrm{trad}}} \qquad (4) \]
To perform the same simulation, the ASGF method uses Eq. (B16) in Appendix 2.B3, which uses the FFT/IFFT to quickly perform the ASGF convolution. The number of multiplications required by Eq. (B16) is expressed in Eq. (B20) in the appendix and is copied here as \( \#*_{\mathrm{asgf}} \):
where n indicates the number of columns of the matrix G (which is the same as that of F), Δt_smp is the sampling rate used to prepare the matrix G (cf. Section 3.3 of part I), and L_opt is an optimal piecewise convolution length, which is introduced and discussed in detail in Appendix 2.B4. Substituting Eqs. (4) and (5) into Eq. (3), we obtain
where the specific numbers pertaining to the realistic case shown in Section 2.2 and consequently p = 2 have been substituted in the second step. Equation (7) shows that the ASGF method is 7.3255 × 10^3 times more efficient than the traditional method for the realistic case. The gain is three orders of magnitude and is contributed by the following three factors.
The first contributing factor is Δt_smp/Δt_trad, which is the ratio of the ASGF sampling rate to the time step used by the traditional model. The ASGF sampling rate, Δt_smp, can be arbitrarily large and is chosen to be the same as the time resolution of a given atmospheric forcing field, which is 3600 sec for the MERRA forcing field. The choice of the sampling rate does not affect the accuracy of the solution; it only determines how frequently we sample the model solution (cf. Section 3.3 of part I). Conversely, the choice of Δt_trad cannot be as arbitrary. Although the ADI scheme may be unconditionally stable, the choice of the time step must also consider the accuracy of the model solution; a larger time step produces a less accurate model solution. For the real storm surge case presented in the previous section, Δt_trad = 60 sec was chosen and is already 12 times as large as the CFL condition time step (i.e., 5 sec) that would have to be used if the difference scheme were the EI scheme instead of the ADI scheme. Thus, the ratio Δt_smp/Δt_trad contributes a factor of 60 to the gain.
The second contributing factor is the ratio of the number of grid points in the storm surge model to that of the atmospheric model, N/n. In the above example, N = 32,224,425 and n = 408,622; their ratio contributes a factor of 78.86 to the gain. An atmospheric model usually has a coarser grid than does an ocean model. The traditional model must spread the coarse-gridded forcing data onto the more finely gridded ocean model. The ASGF method does the opposite: it folds the more finely gridded ocean model to fit the coarse-gridded atmospheric field (see Appendix 3 in part I for details).
The third contributing factor is m/[(0.5 + 0.5/p + 0.5/n) log2 L_opt + 1], of which the numerator is the number of multiplications per time step required by the traditional model, and the denominator is the number of multiplications per time step required by the ASGF convolution via the FFT/IFFT. The ADI scheme makes the numerator m = 12, and the value of the denominator is 7.75 when n = 408,622, p = 2, and log2 L_opt = 9 (Appendix 2.B3 states that the optimal piecewise convolution length L_opt is 2^9 when L_G = 72); thus, the third ratio contributes a factor of approximately 1.5 to the gain.
The gain expressed by Eqs. (6) or (7) is theoretical. We can also obtain an empirical gain by recording how much computer time is required by the ASGF and the traditional methods to perform simulations with the same length. To perform the simulation shown in Fig. 5, the ASGF method required 10.6036 sec (Footnote 1) using conv_FFT in Appendix 2.B5. To perform the same simulation, the traditional method using the ADI scheme required 9.2210 × 10^4 sec when Δt = 60 sec and the same grid was used. The ratio of the latter to the former is 8.6961 × 10^3, which is close to the theoretical gain obtained by Eq. (7). Both the empirical and theoretical gains indicate that the ASGF method is three orders of magnitude faster than the traditional method.
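The arithmetic behind the three contributing factors can be reproduced with a short Python sketch. All input values are the ones quoted in this section; the variable names are hypothetical, and the product of the three factors is taken to be the efficiency ratio, as the text describes.

```python
from math import log2

# Values quoted in the text for the Sept-Iles case
N = 32_224_425     # elements of the surge-model state vector
n = 408_622        # elements of the forcing vector (columns of G)
m = 12             # multiplications per element per step (ADI scheme)
dt_trad = 60.0     # time step of the traditional model, in seconds
dt_smp = 3600.0    # ASGF sampling rate, in seconds
p = 2
L_opt = 2 ** 9     # optimal piecewise convolution length

factor_time = dt_smp / dt_trad                        # first factor
factor_grid = N / n                                   # second factor
per_step_asgf = (0.5 + 0.5 / p + 0.5 / n) * log2(L_opt) + 1
factor_ops = m / per_step_asgf                        # third factor
ratio = factor_time * factor_grid * factor_ops        # theoretical ratio

# Empirical ratio from the two wall-clock times reported in the text
empirical = 9.2210e4 / 10.6036

print(f"factors: {factor_time:.0f}, {factor_grid:.2f}, {factor_ops:.2f}")
print(f"theoretical ratio ~ {ratio:.0f}, empirical ratio ~ {empirical:.0f}")
```

The product comes out at about 7.3 × 10^3, within rounding of the value quoted for Eq. (7), and the empirical ratio is about 8.7 × 10^3.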
4 SVD compression of the forcing field
The convolution matrix G is generally short and wide. The size of G for the real case example used above is 72 × 408,622; this large aspect ratio implies that there is a large null space that can be squeezed out for the sake of computational efficiency and data assimilation. The singular value decomposition (SVD, e.g., Strang 2007) is the method to find such a null space. Applying the SVD to the example G matrix results in:
\[ \mathbf{G} = \mathbf{U} \begin{bmatrix} \mathbf{S} & \mathbf{0} \end{bmatrix} \begin{bmatrix} \mathbf{V} & \mathbf{V}_{2} \end{bmatrix}^{T} \qquad (8) \]
where U and [V V_2] are unitary matrices (i.e., U^T U = U U^T = I and [V V_2]^T [V V_2] = [V V_2][V V_2]^T = I, in which each I is an identity matrix with its appropriate dimensions understood), and S is a diagonal matrix with non-negative real numbers on the diagonal, which are known as the singular values of G. Next to S in the middle is a zero matrix, so that V_2 contributes nothing to G. Discarding V_2, the decomposition reduces to
\[ \mathbf{G} = \mathbf{U}\mathbf{S}\mathbf{V}^{T} \qquad (9) \]
The singular values are arranged in descending order. They regulate the importance of the input modes. Figure 6 shows the singular values of G for Sept-Îles; values larger than one (there are 39 of them) amplify the effects of the corresponding forcing components, whereas those less than one (there are 33 of them) reduce the effects. The figure also indicates a ratio of the last singular value versus the first singular value, which is 0.0019 and means that the significance of mode 72 is less than 0.2 % of that of mode 1. In retrospect, this may justify the choice of 72 hours as the memory time scale for calculating the matrix G.
Substituting Eq. (9) into Eq. (1) results in:
where the associative property of the matrices in convolution is used in the second step (see Eq. (B5) in Appendix 2.B2), and a new forcing vector ψ in the third step is defined as:
The new forcing vector is a compressed forcing vector. The uncompressed forcing vector, f, has 408,622 elements, whereas the compressed one, ψ, has only 72 elements. The compression ratio is about 5675; thereby, the storage and retrieval of the forcing data are greatly facilitated. For example, the global MERRA atmospheric data from 1979 to 2013 require 400 GB of disk space. The SVD compression reduces the size of this dataset to 72 MB per POI. If we have 1000 POIs, which is perhaps a quite large number in practice, the total disk space required to store the compressed forcing data is roughly 70 GB, which can be easily stored on a portable computer.
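The compression step can be summarized in one chain of equalities. The sketch below assumes the reduced decomposition G = U S V^T, assumes that the convolution acts along the time (row) index of G so that the time-independent factor V^T can be moved onto the forcing, and takes ψ to denote V^T f; the dimensions are those quoted in the text.

```latex
\begin{aligned}
\boldsymbol{\eta}
  &= \mathbf{G} * \mathbf{f}
   = \left(\mathbf{U}\mathbf{S}\mathbf{V}^{T}\right) * \mathbf{f}
   = \left(\mathbf{U}\mathbf{S}\right) * \left(\mathbf{V}^{T}\mathbf{f}\right)
   = \left(\mathbf{U}\mathbf{S}\right) * \boldsymbol{\psi}, \\
\boldsymbol{\psi}
  &= \underbrace{\mathbf{V}^{T}}_{72 \times 408{,}622}\,
     \underbrace{\mathbf{f}}_{408{,}622 \times 1} \in \mathbb{R}^{72},
  \qquad \frac{408{,}622}{72} \approx 5675 .
\end{aligned}
```

Because V does not depend on time, the product V^T f can be formed once for every hour of forcing data and stored in place of f.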
The SVD compression also increases the computational efficiency. The convolution of η using Eq. (12) is much faster than using Eq. (10) because there is much less to convolve after the compression. The convolution theorem discussed in Appendix 2.B4 can be applied to Eq. (12) to further increase computational efficiency. Using US and ψ as the first and second inputs to the function conv_FFT in Appendix 2.B5, the same simulation shown in Fig. 5 would be completed in 0.0018 sec; this would increase the empirical gain, which was discussed at the end of the last section, from three orders of magnitude to seven orders of magnitude.
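The convolution theorem itself can be illustrated with a small, self-contained Python sketch. It is not the MATLAB function conv_FFT from the appendix and does not include the piecewise (L_opt) strategy; the function names `fft`, `direct_convolve`, and `fft_convolve` and the toy sequences are hypothetical. It only shows that multiplying the transforms of two zero-padded sequences and inverting the product reproduces their direct convolution.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out

def direct_convolve(g, f):
    """Full linear convolution by the defining double sum."""
    out = [0.0] * (len(g) + len(f) - 1)
    for i, gi in enumerate(g):
        for j, fj in enumerate(f):
            out[i + j] += gi * fj
    return out

def fft_convolve(g, f):
    """Same result via the convolution theorem: IFFT(FFT(g) * FFT(f))."""
    m = len(g) + len(f) - 1
    size = 1
    while size < m:          # pad to a power of two to avoid wrap-around
        size *= 2
    G = fft(list(g) + [0.0] * (size - len(g)))
    F = fft(list(f) + [0.0] * (size - len(f)))
    y = fft([a * b for a, b in zip(G, F)], inverse=True)
    return [v.real / size for v in y[:m]]   # normalize the inverse transform

# Hypothetical toy sequences: a short response and a forcing series
g = [1.0, 0.5, 0.25]
f = [0.0, 1.0, 2.0, 0.0, -1.0]
slow = direct_convolve(g, f)
fast = fft_convolve(g, f)
print(max(abs(a - b) for a, b in zip(slow, fast)))  # round-off level
```

For a matrix convolution in which each column of the first input is convolved with the matching column of the second and the results are summed, the same routine could be applied column by column; that reading of the matrix case is an assumption here.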
The gain in computational efficiency due to the SVD compression of the forcing data was not accounted for in favor of the ASGF method in the last section because one could argue that the data compression itself requires time. However, the gain due to the SVD compression should not be overlooked. Actually, the retrieval of the forcing data from a computer’s hard disk takes much more time than does the compression performed within the RAM. Therefore, if 1000 POIs are considered, we can simultaneously compress the same forcing vector f for these POIs once the forcing vector is loaded into the RAM. The time required for the forcing data compression per POI is thus reduced by a factor of 1000. The abovementioned 34 years of hourly MERRA forcing field data were compressed this way for all of the permanent tidal gauges in eastern Canada. The compressions need to be performed only once. Then, simulations of storm surges for any period within the 34-year period for any of the POIs can be performed quickly. For example, it takes only half of a second to compute a 10-year-long hourly simulation. Such a high simulation speed can make simulated storm surges instantaneously available for further related studies. We may find that many of our studies are explorative, requiring repetitive simulations in order to determine a best approach or to draw a conclusion. For example, in data assimilation with weighted least-squares regression, we may have to try many simulations with different weighting schemes. Without an instantaneous simulation capability, such explorative studies would be greatly hindered.
5 From the ASGF convolution to a regression model
From Eq. (12), we have:
which we can transform into:
using the associative property shown in Eq. (B5) in Appendix 2.B2. The following relationship has been proved in Appendix 3:
where S and s have the same content but are arranged differently, as do Ψ and ψ. Substituting Eq. (16) into Eq. (15), we have:
where C is a matrix given by the column-wise convolution between the matrix U and Ψ:
which can be computed very quickly with the FFT/IFFT. The MATLAB function conv_FFT provided in Appendix 2.B5 can be used to evaluate the C matrix. It takes only one second to evaluate a C matrix with 10 years of hourly forcing field.
By admitting an error term, ε, which may consist of errors from the observations, the forcing data, and imperfections of the model, we can rewrite Eq. (17) as a regression model:
\[ \boldsymbol{\upeta} = \mathbf{C}\mathbf{s} + \boldsymbol{\varepsilon} \qquad (19) \]
where s should now be viewed as a vector of regression parameters. These parameters originate from the surge model based on the SVD of G; they can now be relaxed from the given values to best fit the observations. The least-squares fitting parameter vector, ŝ, is given by:
\[ \widehat{\mathbf{s}} = \left(\mathbf{C}^{T}\mathbf{C}\right)^{-1}\mathbf{C}^{T}\boldsymbol{\upeta}_{o} \qquad (20) \]
where η_o is the observed data (i.e., the detided signals). The least-squares fitted solution, \( \widehat{\boldsymbol{\upeta}} \), is given by:
\[ \widehat{\boldsymbol{\upeta}} = \mathbf{C}\widehat{\mathbf{s}} \qquad (21) \]
(e.g., Strang 1986, 2007; Seber and Lee 2012).
Section 2.2 shows a simulation of storm surges at Sept-Îles for 441 hours, starting from December 1, 2010. The simulation is performed without data assimilation. The γ² value, which measures the misfit between the simulations and the observations (see Eq. (2)), is equal to 0.18. Let us now see how much the regression model described in Eq. (19) can further reduce the misfit. The simulation without data assimilation from Fig. 5 is copied in the top panel of Fig. 7. The simulation with data assimilation using Eqs. (19) to (21) is shown in the bottom panel of the figure. As shown, the data assimilation significantly improves the fit between the simulation (in red) and the observations (in black). The γ² value is reduced to 0.02, which indicates that 98 % of the observed variance is accounted for by the simulation.
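The least-squares step can be sketched in plain Python for a toy problem. The sketch assumes the ordinary least-squares estimate obtained from the normal equations, C^T C ŝ = C^T η_o; the 5 × 2 matrix `C`, the vector `s_model` of "given" parameters, and the observations `eta_obs` are all hypothetical stand-ins and not values from this study.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def least_squares(C, eta_obs):
    """Return s_hat minimizing ||C s - eta_obs||^2 via the normal equations."""
    k = len(C[0])
    CtC = [[sum(row[i] * row[j] for row in C) for j in range(k)] for i in range(k)]
    Cty = [sum(row[i] * y for row, y in zip(C, eta_obs)) for i in range(k)]
    return solve(CtC, Cty)

def predict(C, s):
    """Fitted series C s."""
    return [sum(c * si for c, si in zip(row, s)) for row in C]

def rss(y, y_fit):
    """Residual sum of squares."""
    return sum((a - b) ** 2 for a, b in zip(y, y_fit))

# Hypothetical toy problem: 5 time samples, 2 regression parameters
C = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [1.0, 3.0]]
s_model = [2.0, 0.5]                   # parameters "given by the model"
eta_obs = [2.1, 0.4, 2.5, 4.6, 3.4]    # noisy "observations"

s_hat = least_squares(C, eta_obs)
eta_hat = predict(C, s_hat)
print(rss(eta_obs, predict(C, s_model)), rss(eta_obs, eta_hat))
```

The fitted parameters can never give a larger residual sum of squares than the given ones, which is the mechanism by which the misfit is reduced. For a real case with 72 parameters, a library solver based on QR or SVD would usually be preferable to forming C^T C explicitly, because the normal equations can be ill-conditioned.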
6 Summary and discussions
This paper is the second part of this study. Starting with the ASGF convolution of Eq. (1) developed in part I, this paper first validated the ASGF algorithm with an analytical solution obtained from a simplified dynamical system: waves in a non-rotational, rectangular basin with a flat bottom. The validation produced excellent results (Figs. 1, 2 and 3). The ASGF algorithm was then further tested with a real storm surge event that occurred in Sept-Îles during December 2010. The test yielded a very satisfactory result: 82 % of the observed variance was explained by the simulation (Fig. 5), which also indicates that the linear and depth-averaged shallow water dynamics is indeed a good basis for developing the ASGF method.
The gain in the computational efficiency due to the ASGF method compared to the traditional method was theoretically analyzed and empirically estimated. Both the theoretical and empirical gains indicate that the ASGF method is three orders of magnitude faster than the traditional method in producing the same simulation shown in Fig. 5. There are three factors that make the ASGF method perform better than the traditional method. First, to produce an hourly time series, the ASGF method can use an hourly time step without affecting the accuracy of the simulation (cf. the sampling rate discussed in Section 3.3 of part I), whereas the traditional method has to use a much smaller time step for the sake of computational stability or accuracy. Second, the spatial resolution in a given atmospheric forcing field is usually much coarser than the grid spacing of a surge model in the ocean; consequently, the atmospheric forcing vector is much shorter than the surge state vector. The ASGF method takes advantage of the difference in the two spatial resolutions by compressing the columns of its convolution matrix to fit the shorter forcing vector; conversely, the traditional method must stretch the forcing vector to fit the longer state vector. Third, the ASGF method can use the FFT/IFFT to speed up the computation, whereas the traditional method cannot.
Another advantage of having cast a traditional storm surge model into an ASGF convolution is that the SVD can be applied to the convolution matrix. Using the SVD, the forcing data can be greatly compressed. The real case example shows that the compression rate is 5675 times; thus, a long forcing vector with 408,622 elements can be replaced by a shorter vector composed of only 72 elements. The compression significantly facilitates the storage and retrieval of the forcing data and makes the convolution performed more rapidly, allowing for long-term storm surge simulations to be finished within seconds.
The most important benefit of the application of SVD is data assimilation. With the SVD, the ASGF convolution, Eq. (1), is cast into a regression model, Eq. (19). The singular values can be moved from the middle position to the right position so that they can be relaxed from the values given by the original surge model to those best adjusted to the observed data. The power of the regression model to fit the simulations to the observations is well demonstrated. Without data assimilation, the overall misfit between the observations and the simulations is 0.18. With the data assimilation, the overall misfit is decreased to 0.02 (cf. Fig. 7).
The main point of this study is that a complicated and time-consuming storm surge model is cast into a simple and efficient regression model, with which we can easily conduct various regression analyses. The data assimilation demonstrated in the last section is not the only option; for example, both sides of Eq. (19) could be multiplied by a weighting matrix so that large values carry more weight in determining the least-squares solution. Large values are more likely to be caused by storms than are small values.
Due to its fast simulation speed and data assimilation capability, the ASGF regression model shown in Eq. (19) provides an effective tool for long-term hindcasts and forecasts of storm surges. With past observations and realistic atmospheric forcing data, a hindcast can yield the best estimated model parameter vector \( \widehat{\mathbf{s}} \). With the best estimated \( \widehat{\mathbf{s}} \) and a set of climate model solutions for the future that can be used as the atmospheric forcing, we can climatologically forecast storm surges. Xu et al. (2015) have taken this approach and produced storm surges at Sept-Îles over the next hundred years, which can serve as a database for further statistical analyses such as extreme value analyses.
Notes
Footnote 1: All the computations mentioned in the paper were carried out on a Dell M6600 with an Intel Core i7-2960XM CPU @ 2.70 GHz and 32 GB of RAM.
Footnote 3: For a solid vertical coast, applying a zero or non-zero horizontal force makes no difference if we are concerned with only water movement. However, the former choice makes the boundary conditions homogeneous, which will in turn make the solution much easier to obtain.
Footnote 4: Using the divergence theorem of Gauss and the zero fluxes at the coasts, the domain-wise integral of the RHS of (A1) is zero.
References
Arakawa A, Lamb VR (1977) Computational design of the basic dynamical processes of the UCLA general circulation model. Methods Comput Phys 17. Academic Press, pp 173–265
Courant R, Hilbert D (1962) Methods of mathematical physics, vol II: partial differential equations. Interscience, New York
Leendertse JJ (1967) Aspects of a computational model for long-period water-wave propagation (p. 165). Rand Corporation, Santa Monica
Myint-U T, Debnath L (2007) Linear partial differential equations for scientists and engineers. Birkhäuser, Boston
Seber GA, Lee AJ (2012) Linear regression analysis (Vol. 936). John Wiley & Sons
Sielecki A (1968) An energy-conserving difference scheme for the storm surge equations. Mon Wea Rev 96:150–156
Strang G (1986) Introduction to applied mathematics. Wellesley-Cambridge Press, Wellesley
Strang G (2007) Computational science and engineering. Wellesley-Cambridge Press, Wellesley
Xu Z (2015) The all-source Green’s function (ASGF) and its applications to storm surge modeling, part I: from the governing equations to the ASGF convolution. Ocean Dynamics. doi:10.1007/s10236-015-0893-z
Xu Z, Savard J-P, Lefaivre D (2015) Data assimilative hindcast and climatological forecast of storm surges at Sept-Îles in the Estuary of Gulf of St. Lawrence. Atmosphere-Ocean. doi:10.1080/07055900.2015.1079774
Acknowledgments
This study was partially supported by le Ministère des Transports du Québec and by the ACCASP program of the Department of Fisheries and Oceans. This study also benefited from collaboration with the OURANOS Consortium (Footnote 2). Free access to the MERRA data provided by NASA is also greatly appreciated. The author also gratefully acknowledges the support from his own institute.
Additional information
Responsible Editor: Kevin Lamb
This article is part of the Topical Collection on the 6th International Workshop on Modeling the Ocean (IWMO) in Halifax, Nova Scotia, Canada 23-27 June 2014
Appendices
Appendix 1: An analytical solution in a simplified dynamical system
This section verifies the numerical approach presented in this study with an analytical solution in a simplified dynamical system. A rectangular, non-rotational, and flat-bottom basin is considered. This simple setting permits an analytical solution. For this case, the governing equations of Eq. (1) in part I simplify to:
where x and y are horizontal Cartesian coordinates along the east and north directions, respectively, and the other notations are the same as those used in Eq. (1) of part I. The lateral boundary conditions are zero fluxes at x = (0, a) and y = (0, b), and the initial values for η, U and V are all assumed to be zero.
We can eliminate U and V in favor of η:
where:
which represents the atmospheric forcing field. The lateral boundary conditions, namely Eqs. (A7) and (A8), are a consequence of the zero normal fluxes and zero external forcing at the solid vertical coast (see footnote 3).
Condition (A11) is a result of the domain-wise integration of the continuity equation (see footnote 4) and ensures that mass is conserved globally. From a mathematical perspective, a governing equation with the second-order time derivative needs only two initial conditions, as described by Eqs. (A9) and (A10). However, the solution so obtained may be mathematically valid but not physical. Using Eq. (A11) as an additional constraint helps eliminate any unphysical solutions. Consistently, the domain-wise integral of the forcing field must also vanish:
which can be verified by the domain-wise integration of Eq. (A6) using Eqs. (A7), (A8), and (A11).
Equation (A6) is linear and inhomogeneous. Its solution can be obtained with Duhamel’s principle (e.g., Courant and Hilbert 1962) and the linear superposition principle. With Duhamel’s principle, the inhomogeneous equation can be modified to become homogeneous. With the linear superposition principle, the homogeneous solutions can be superimposed to recover the inhomogeneous solution. The solution to Eq. (A6) is:
where ξ is the solution of the following homogeneous equation:
Using the technique of separation of variables (e.g., Myint-U and Debnath 2007), ξ can be expressed as:
where s is a frictional parameter, ω mn are the wave frequencies, and B mn are the Fourier series coefficients; these values are given below:
where 0 ≤ τ ≤ t, m = 1, 2, ⋯ and n = 1, 2, ⋯. The value of the frictional parameter s = κ/(2h) should be chosen such that all ω mn are real valued.
The time convolution expressed by Eq. (A14), together with Eqs. (A20) through (A27), provides a general solution to wave motion in the rectangular domain driven by any type of forcing field, provided that the condition stated by Eq. (A13) holds. As a simple example, consider a forcing field associated with the following wind and air pressure fields:
In this case, the forcing field expressed by Eq. (A12) is reduced to:
where τ x can be calculated from U 10 and V 10 with Eqs. (3) and (4) in part I.
The U 10 specified by Eq. (A28) represents a constant west wind blowing uniformly eastward within the domain at a speed of 20 m/s. However, the wind speeds at the two boundary ends (i.e., x = 0 and x = a) are set to zero because a non-zero wind cannot have any effect on a solid vertical wall. The given wind speed is therefore a step function across the solid boundary, and so is the associated wind stress. Specifically, from x = 0 to x = ε, where ε is any positive infinitesimal, the wind stress increases from 0 to a positive value, and from x = a − ε to x = a, it decreases from the same positive value to 0. Therefore, based on Eq. (A29), the forcing field F consists of two Dirac δ-functions:
where τ is a constant stress (\( \tau = C_d \left|U_{10}\right| U_{10} = 2.8 \times 10^{-3} \times 20^2 = 1.12\ \mathrm{m^2/s^2} \)). It is evident that the above F respects the condition of Eq. (A13).
Substituting the above F into Eqs. (A25) to (A27), we find that:
The homogeneous solution, Eq. (A20), then becomes:
and the inhomogeneous solution, Eq. (A14), can be evaluated as:
The solution shows that the wave motion is independent of y, which is expected because the external forcing is applied only in the x-direction. The absence of y in the solution indicates that the wave motion in the y-direction is uniform.
The slope of the sea surface can be determined by taking the x-derivative of each term of the above series:
When t → ∞, the asymptotic sea surface and its slope become:
In the second steps of the above equations, the following two well-known Fourier series have been used:
which one can verify directly or find in standard references (see footnote 5).
The asymptotic solutions given by Eqs. (A36) and (A37) can also be obtained directly from the original momentum equations. For the constant wind stress considered here, we can immediately see the asymptotic solution from Eq. (A2): the asymptotic u and its time derivative all vanish due to friction. This leaves only the slope in the sea surface balancing the wind stress as t → ∞:
Integrating the above equation, we can determine the asymptotic sea surface as follows:
where the integration constant is determined as const = a/2 in the second step by the global mass conservation \( \int_0^a \eta \, dx = 0 \).
Appendix 2: Convolution of matrices, its definition, and properties
2.1 B1: Definition of convolution of matrices
Equation (19) in part I is copied here for convenience after dropping the subscript “c”:
where f is a forcing vector, which changes in time, as implied by its superscript k. The forcing field consists of the set of the forcing vectors f (k) (k = 0, 1, 2, ⋯, k max). Introduce a matrix F to accommodate the forcing vectors:
A column of F is a time series of the forcing field at a particular spatial point, and a row of F is a forcing vector at a particular time. The arrangement of the temporal-spatial information in F is such that the temporal information is contained in the columns, and the spatial information is contained in the rows. The matrix G shares the same temporal-spatial information arrangement (cf. Eq. (17) and Section 4.1 of part I). The actual time span corresponding to the length of F is equal to L F × Δt F, where Δt F is the time interval that comes with the atmospheric model solutions, which is usually hourly. The actual time span corresponding to the length of G is equal to L G × Δt smp, where Δt smp is the sampling rate used to pre-calculate the matrix G (cf. Section 3.3 of part I). For storm surge problems, Δt smp should be chosen to be the same as Δt F.
The convolution defined by Eq. (B1) can now be re-expressed as:
Using Eqs. (B2) and (B3), we can re-express Eq. (1) as:
Equation (B3) defines a convolution between two matrices. It extends the convolution between two time series (i.e., two vectors), which is the special case in which G and F each have only one column. Because of this connection, the number of rows in G and F will be frequently referred to as the lengths of G and F in the text below.
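The definition can be illustrated with a minimal Python sketch. It assumes that Eq. (B3) sums, for each response time, the inner products of row i of G with the forcing vector at the time lagged by i; the function name `conv_direct` and the list-of-rows layout are illustrative choices, not part of the original study.

```python
def conv_direct(G, F):
    """Directly convolve two matrices given as lists of rows.

    G has L_G rows and F has L_F rows; both have n columns.
    Returns a list of L_G + L_F - 1 response values.
    """
    L_G, L_F = len(G), len(F)
    n = len(G[0])
    if any(len(row) != n for row in G + F):
        raise ValueError("G and F must have the same number of columns")
    eta = [0.0] * (L_G + L_F - 1)
    for i, g_row in enumerate(G):        # lag index into the kernel G
        for j, f_row in enumerate(F):    # time index of the forcing vector
            # inner product of kernel row i with forcing vector j
            eta[i + j] += sum(g * f for g, f in zip(g_row, f_row))
    return eta


# One-column case: reduces to the ordinary convolution of two time series.
vector_case = conv_direct([[1.0], [2.0]], [[1.0], [1.0], [1.0]])
# Two-column case: each output sums contributions from both columns.
matrix_case = conv_direct([[1, 2], [3, 4]], [[1, 0], [0, 1], [1, 1]])
print(vector_case, matrix_case)
```

With one column the routine behaves like a standard vector convolution, which is the special case noted in the text above.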
2.2 B2: Associative and commutative properties of matrices in convolution
2.2.1 Associative property
If G = USV T, the following associative property holds:
This property can be proved as below:
where k = 0, 1, 2, ⋯.
2.2.2 Commutative property
In many textbooks, this property is proved for two vector operands. It also holds when the two operands are matrices. This can be proved as follows:
for any k = 0, 1, 2, ⋯, k max , where m = k − i in the second step, and the positions of G and F T are exchanged with their transposes. The property is thus proved.
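The commutative property can also be checked numerically. The sketch below is only a spot check on small random integer matrices, not a proof; the helper `conv_direct` and the matrix sizes are assumptions made for illustration.

```python
import random


def conv_direct(A, B):
    """Matrix convolution: sum of row-by-row inner products at each total lag."""
    out = [0] * (len(A) + len(B) - 1)
    for i, a_row in enumerate(A):
        for j, b_row in enumerate(B):
            out[i + j] += sum(a * b for a, b in zip(a_row, b_row))
    return out


random.seed(0)
n = 4  # number of columns shared by G and F (hypothetical size)
G = [[random.randint(-5, 5) for _ in range(n)] for _ in range(3)]
F = [[random.randint(-5, 5) for _ in range(n)] for _ in range(6)]

# Swapping the two operands should leave every response value unchanged.
commutes = conv_direct(G, F) == conv_direct(F, G)
print(commutes)
```

Integer entries are used so that the comparison is exact and not affected by floating-point rounding.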
2.3 B3: The convolution length and the number of multiplications
Equation (B4) may appear abstract; it may be worthwhile to explain it with a concrete example. Without loss of generality, let us assume that the matrix G has only two rows and the matrix F has three rows, i.e., G = [r 0; r 1] and F = [f (0) f (1) f (2)]T. In this case, based on the definition given in Eq. (B3), we can write Eq. (B4) as:
where η 1, η 2, η 3 and η 4 are the four convolution values at four times (t = Δt, 2Δt, 3Δt, 4Δt, where Δt may equal one hour, for example). The first term of the right-hand side (RHS) represents a train of waves that is set into motion by the first instantaneous forcing vector, F T(:, 1). The second term is another train of waves that is excited by the second instantaneous forcing vector, F T(:, 2), and similarly for the third term. The second instantaneous forcing vector does not exist until the second time step; this is why we see that r 0 and r 1 are shifted down by one position in the column vector in the second term. Similarly, the reason we see two zeros before r 0 and r 1 in the column vector of the third term is because the instantaneous forcing vector F T(:, 3) does not exist until the third time step. Once set into motion, these wave trains become free waves because their associated instantaneous forces no longer exist. The total response is a sum of these free wave trains.
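For reference, the shift structure described above can be written out as follows. This is a reconstruction from the verbal description in this paragraph (rows r 0 and r 1 shifted down by one position per forcing vector), offered as an aid to reading rather than as a copy of the typeset Eq. (B13):

\[
\begin{bmatrix} \eta_1 \\ \eta_2 \\ \eta_3 \\ \eta_4 \end{bmatrix}
=
\begin{bmatrix} \mathbf{r}_0 \\ \mathbf{r}_1 \\ \mathbf{0} \\ \mathbf{0} \end{bmatrix} \mathbf{F}^T(:,1)
+
\begin{bmatrix} \mathbf{0} \\ \mathbf{r}_0 \\ \mathbf{r}_1 \\ \mathbf{0} \end{bmatrix} \mathbf{F}^T(:,2)
+
\begin{bmatrix} \mathbf{0} \\ \mathbf{0} \\ \mathbf{r}_0 \\ \mathbf{r}_1 \end{bmatrix} \mathbf{F}^T(:,3)
\]

Here each \( \mathbf{0} \) denotes a zero row of the same width as \( \mathbf{r}_0 \) and \( \mathbf{r}_1 \).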
This simple example assumes that there are only three successive forces and that the wave set up by each force only lasts two time steps; therefore, the length of the total response is four time steps (i.e., 2 + 3 − 1 = 4). Generally speaking, the length of the convolution response vector is given by the formula:
\[ L_R = L_G + L_F - 1 \quad \text{(B14)} \]
where L R is the length of the response vector, L G is the length of the convolution kernel, and L F is the length of the forcing vector. As noted at the end of Appendix 2.B1, the length of a matrix is a synonym for the number of rows of the matrix.
The number of multiplications entailed in Eq. (B13) can be analyzed in this way: G and F have the same number of columns, and let this number be denoted as n. This means that each row times a column on the RHS of the equation (e.g., r 0 × F T(:, 1)) requires n multiplications. There are two such rows and three such columns; the total number of multiplications is thus 2 × 3 × n. In general, the number of multiplications, \( \#{\ast}_{\mathrm{eqB3}} \), required to perform a convolution using Eq. (B3) is equal to:
\[ \#{\ast}_{\mathrm{eqB3}} = L_G \times L_F \times n \quad \text{(B15)} \]
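The counting argument can be reproduced with a few lines of Python. The sketch walks the same two loops as the direct convolution and tallies n multiplications per row-column product; the value n = 5 is an arbitrary choice for illustration.

```python
def count_direct_mults(L_G, L_F, n):
    """Count multiplications by walking the loops of the direct convolution."""
    count = 0
    for _ in range(L_G):        # each row of G ...
        for _ in range(L_F):    # ... meets each forcing vector once
            count += n          # one inner product of two n-vectors
    return count


L_G, L_F, n = 2, 3, 5
L_R = L_G + L_F - 1                       # length of the response vector
count = count_direct_mults(L_G, L_F, n)   # tallied multiplications
print(L_R, count, count == L_G * L_F * n)
```

For the two-row, three-row example in the text, this gives a response of length 4 and 2 × 3 × n multiplications.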
2.4 B4: Convolution theorem
The ASGF convolution can also be performed via FFT/IFFT:
where \( {\widehat{\mathbf{G}}}_z=\operatorname{fft}\left({\mathbf{G}}_z\right) \), \( {\widehat{\mathbf{F}}}_z=\operatorname{fft}\left({\mathbf{F}}_z\right) \), and G z and F z are zero-padded versions of G and F. The symbol “.*” denotes the element-by-element multiplication operator, and \( \mathrm{sum}\left({\widehat{\mathbf{G}}}_z.\ast {\widehat{\mathbf{F}}}_z,\kern0.5em 2\right) \) sums the product \( {\widehat{\mathbf{G}}}_z.\ast {\widehat{\mathbf{F}}}_z \) along each row, i.e., across its columns. Note that the result of \( \mathrm{sum}\left({\widehat{\mathbf{G}}}_z.\ast {\widehat{\mathbf{F}}}_z,\kern0.5em 2\right) \) is a one-column vector. Appropriate zero padding is necessary in the FFT so that the tail of the resultant vector is not folded back onto its head. Equation (B16) extends the well-known convolution theorem (e.g., Strang 2007) from vectors to matrices.
Computing a convolution via FFT/IFFT is generally much faster than directly computing it according to the convolution definition. Let us examine how many multiplications are required by Eq. (B16). When an FFT or IFFT is applied to an L R-point vector, it requires (L R/2) log2(L R) multiplications if L R is a power of 2. In Eq. (B16), two FFTs are applied to G z and F z, both of which have n columns; this requires nL R log2(L R) multiplications. There are also element-by-element multiplications between \( {\widehat{\mathbf{G}}}_z \) and \( {\widehat{\mathbf{F}}}_z \), which require nL R multiplications. Finally, one IFFT is applied to a one-column vector, which is \( \mathrm{sum}\left({\widehat{\mathbf{G}}}_z.\ast {\widehat{\mathbf{F}}}_z,\kern0.5em 2\right) \); this requires (L R/2) log2(L R) multiplications. The total number of multiplications is then given by:
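The FFT route can be sketched in plain Python as follows. This is an illustrative stand-in, not the MATLAB function conv_FFT of this study: the recursive radix-2 `fft`, the helper `conv_fft`, and the padding to the next power of two are all assumptions chosen to keep the example self-contained.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * N
    for k in range(N // 2):
        tw = cmath.exp(sign * 2j * cmath.pi * k / N) * odd[k]
        out[k] = even[k] + tw
        out[k + N // 2] = even[k] - tw
    return out


def ifft(X):
    """Inverse FFT: conjugate-sign transform followed by division by N."""
    N = len(X)
    return [v / N for v in fft(X, inverse=True)]


def conv_fft(G, F):
    """Matrix convolution via FFT: transform columns, multiply, sum, invert."""
    L_G, L_F, n = len(G), len(F), len(G[0])
    L_R = L_G + L_F - 1
    L_pad = 1
    while L_pad < L_R:               # zero-pad so the tail does not wrap around
        L_pad *= 2
    spec_sum = [0j] * L_pad          # running row-wise sum of G_hat .* F_hat
    for c in range(n):
        g_col = [G[i][c] for i in range(L_G)] + [0.0] * (L_pad - L_G)
        f_col = [F[j][c] for j in range(L_F)] + [0.0] * (L_pad - L_F)
        G_hat, F_hat = fft(g_col), fft(f_col)
        for k in range(L_pad):
            spec_sum[k] += G_hat[k] * F_hat[k]
    # a single inverse transform is applied to the summed spectrum
    return [v.real for v in ifft(spec_sum)[:L_R]]


eta = conv_fft([[1, 2], [3, 4]], [[1, 0], [0, 1], [1, 1]])
print([round(v, 9) for v in eta])
```

Summing the spectra before the inverse transform is what keeps the cost to one IFFT regardless of the number of columns.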
The ratio of \( \#{\ast}_{\mathrm{eqB3}} \) to \( \#{\ast}_{\mathrm{eqB16}} \) is:
where Eq. (B14) has been used. Figure 8 shows how this ratio varies with log2 L R when L G = 72 and n = 408,622, both of which pertain to the realistic example in Section 2.2. The ratio peaks at log2 L R = 9, after which it monotonically decreases toward zero. The peak ratio is 6.2, which indicates that the calculation of the same convolution using Eq. (B16) requires 6.2 times fewer multiplications than using Eq. (B3). In other words, Eq. (B16) is 6.2 times more efficient than Eq. (B3). After the peak, the ratio decreases but remains above 3 up to log2 L R = 20, which is equivalent to 119.7 years of hourly points. Although not shown in the figure, the ratio changes from above one to below one when log2 L R increases from 70 to 71, because for large L R the ratio approaches L G/(log2 L R + 1). This indicates that for all log2 L R < 71, Eq. (B16) is more efficient than Eq. (B3). The convolution theorem does not win for all L R because it is applied here to two sets of time series of different lengths, L G and L F, where L G is fixed and L F increases as L R increases. If L G and L F were of the same length and both increased with L R, the convolution theorem would always win.
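The behavior of the ratio can be reproduced from the multiplication counts stated above. The following sketch assumes the direct count L G × L F × n and the FFT count nL R log2 L R + nL R + (L R/2) log2 L R; the function name `mult_ratio` and the scanned range of exponents are illustrative.

```python
def mult_ratio(y, L_G=72, n=408622):
    """Ratio of direct to FFT-based multiplication counts for L_R = 2**y."""
    L_R = 2 ** y
    L_F = L_R - L_G + 1                            # from L_R = L_G + L_F - 1
    direct = L_G * L_F * n                         # direct convolution
    via_fft = n * L_R * y + n * L_R + (L_R // 2) * y   # FFTs, products, IFFT
    return direct / via_fft


# Scan the exponents for which the padded length exceeds L_G = 72.
ratios = {y: mult_ratio(y) for y in range(7, 21)}
y_peak = max(ratios, key=ratios.get)
print(y_peak, round(ratios[y_peak], 1), round(ratios[20], 2))
```

The scan places the peak at log2 L R = 9 with a ratio of about 6.2, and the ratio at log2 L R = 20 is still above 3.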
The peak ratio occurs at \( L_R = 2^9 \), which therefore should be considered to be the optimal convolution length. Denoting this length as L opt, we can use the length L f = L opt − L G + 1 to divide the entire length of the forcing matrix into p pieces, where p = ceil(L F/L f) and “ceil” is a function that rounds its argument up to the nearest integer. For each piece, we can apply the convolution theorem to obtain a response vector; we can then assemble the piecewise response vectors into a final long response vector. Note that \( \widehat{{\mathbf{G}}_{\mathrm{z}}}=\mathrm{f}\mathrm{f}\mathrm{t}\left({\mathbf{G}}_{\mathrm{z}}\right) \) needs to be computed only once; it can then be used for all of the pieces of the forcing vector (more precisely, the forcing matrix). Denote F (p)z as a zero-padded version of the p’th piece and still use G z to denote a zero-padded version of G but now only to the length of L opt. Then, the piecewise approach requires the following numbers of multiplications: to compute \( \widehat{{\mathbf{G}}_z}=\mathrm{f}\mathrm{f}\mathrm{t}\left({\mathbf{G}}_z\right) \) once requires nL opt/2 log2 L opt multiplications; to compute \( \widehat{{\mathbf{F}}_{\mathrm{z}}^{(p)}}=\mathrm{f}\mathrm{f}\mathrm{t}\left({\mathbf{F}}_{\mathrm{z}}^{(p)}\right) \) p times requires pnL opt/2 log2 L opt multiplications; to compute \( \widehat{{\mathbf{G}}_{\mathrm{z}}}.\ast \widehat{{\mathbf{F}}_{\mathrm{z}}^{(p)}} \) p times requires pnL opt multiplications; and to compute ifft(sum(\( \widehat{{\mathbf{G}}_{\mathrm{z}}}.\ast \widehat{{\mathbf{F}}_{\mathrm{z}}^{(p)}} \),2)) p times requires pL opt/2 log2 L opt multiplications. The total number of multiplications entailed in Eq. (B16) with the optimal piece-wise approach is:
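The assembly of the piecewise responses can be sketched as below. To stay short, each piece is convolved by direct summation as a stand-in for the FFT/IFFT step, so the example illustrates only how the forcing matrix is cut into pieces of length L f and how the overlapping tails of successive responses are added; the names `conv_piecewise` and `conv_direct` are hypothetical.

```python
import math
import random


def conv_direct(G, F):
    """Direct matrix convolution (stand-in for the per-piece FFT/IFFT step)."""
    out = [0] * (len(G) + len(F) - 1)
    for i, g_row in enumerate(G):
        for j, f_row in enumerate(F):
            out[i + j] += sum(g * f for g, f in zip(g_row, f_row))
    return out


def conv_piecewise(G, F, L_opt):
    """Convolve G with F piece by piece and add the overlapping responses."""
    L_G, L_F = len(G), len(F)
    L_f = L_opt - L_G + 1                # forcing rows handled per piece
    if L_f < 1:
        raise ValueError("L_opt must be at least L_G")
    p = math.ceil(L_F / L_f)             # number of pieces
    eta = [0] * (L_G + L_F - 1)
    for q in range(p):
        start = q * L_f
        piece = F[start:start + L_f]     # the last piece may be shorter
        for k, v in enumerate(conv_direct(G, piece)):
            eta[start + k] += v          # tails overlap the next piece and add
    return eta


random.seed(1)
n = 3
G = [[random.randint(-3, 3) for _ in range(n)] for _ in range(3)]
F = [[random.randint(-3, 3) for _ in range(n)] for _ in range(20)]
same = conv_piecewise(G, F, L_opt=8) == conv_direct(G, F)
print(same)
```

Each piece produces a response that is L G − 1 samples longer than the piece itself, which is why the contributions must be summed rather than simply concatenated.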
where the following relation is used in the second step:
where T is the time span of the convolution response, and Δt smp is the sampling rate used to pre-calculate the matrix G (cf. Section 3.3 of part I), which is chosen to be the same as the time interval in a given atmospheric forcing field.
The ratio of \( \#{\ast}_{\mathrm{eqB3}} \) to the above optimized \( \#{\ast}_{\mathrm{eqB16}} \) is equal to:
where the specific values of L G = 72, \( L_{\mathrm{opt}} = 2^9 \), and n = 408,622, which are pertinent to the real storm surge case simulated in Section 2.2, have been substituted in the second step. Equation (B22) is general, whereas Eq. (B23) is specific because different values of L G have different values of L opt.
Table 1 lists the ratios for different values of L F. As shown, the above divide-and-conquer algorithm significantly increases the ratios. For example, when L F = 4380 × 24, the ratio is 11.2, whereas Fig. 8 shows that the ratio is below 4.2 for the same L F (log2 L F = 16.6817).
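The quoted ratio for L F = 4380 × 24 can be checked by adding up the four piecewise contributions itemized earlier (one FFT of G z, then p FFTs, p element-wise products, and p IFFTs). The sketch below uses the exact ceiling for p, which may differ slightly from any approximation used in the typeset equations; the function names are illustrative.

```python
import math


def mults_piecewise(L_F, L_G, L_opt, n):
    """Total multiplications for the piecewise FFT approach."""
    y = int(math.log2(L_opt))            # L_opt is assumed to be a power of two
    L_f = L_opt - L_G + 1
    p = math.ceil(L_F / L_f)             # number of forcing pieces
    fft_G = (n * L_opt // 2) * y         # fft(G_z), computed once
    fft_F = (p * n * L_opt // 2) * y     # fft of each zero-padded piece
    products = p * n * L_opt             # element-by-element products
    iffts = (p * L_opt // 2) * y         # one IFFT per piece
    return fft_G + fft_F + products + iffts


def ratio_piecewise(L_F, L_G=72, L_opt=2 ** 9, n=408622):
    """Ratio of the direct count L_G * L_F * n to the piecewise FFT count."""
    return L_G * L_F * n / mults_piecewise(L_F, L_G, L_opt, n)


ratio_half_year = ratio_piecewise(4380 * 24)
print(round(ratio_half_year, 1))
```

Under these assumptions the ratio comes out at about 11.2, in line with the value quoted from Table 1.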
In the above calculations, L G = 72 has been used throughout, which results in the optimal convolution length of \( L_{\mathrm{opt}} = 2^9 \). In general, L opt is a function of L G and can be solved from:
where y = log2(L opt). Table 2 lists the solutions for some values of L G.
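As a rough cross-check that does not rely on the closed-form condition, the optimal exponent can also be located by brute force. The sketch below simply scans integer values of y and keeps the one that maximizes the single-block ratio of direct to FFT-based multiplication counts; it is a stand-in for solving the equation above, restricted to integer y, and the function name and search bound are assumptions.

```python
def best_log2_length(L_G, n=408622, y_max=30):
    """Scan integer exponents y and return the one with the largest ratio."""
    best_y, best_ratio = None, 0.0
    for y in range(1, y_max + 1):
        L_R = 2 ** y
        L_F = L_R - L_G + 1
        if L_F < 1:                      # padded length shorter than the kernel
            continue
        direct = L_G * L_F * n
        via_fft = n * L_R * y + n * L_R + (L_R // 2) * y
        ratio = direct / via_fft
        if ratio > best_ratio:
            best_y, best_ratio = y, ratio
    return best_y, best_ratio


y_opt, peak = best_log2_length(72)
print(y_opt, round(peak, 1))
```

For L G = 72 the scan returns y = 9, matching the optimal length used throughout this appendix.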
2.5 B5: MATLAB function conv_FFT
The MATLAB function called conv_FFT, which is given in Table 3, implements Eq. (B16) via the divide-and-conquer approach.
Appendix 3: Position swap between the singular values and the compressed forcing vector
The following relation can be proven:
where S and s contain the same elements but are arranged differently, so are Ψ and ψ. This relation states that the singular values contained in the diagonal matrix S and the elements in the compressed forcing vector ψ can swap positions after an appropriate re-arrangement of their contents. The appropriate re-arrangement is as follows: s is a column vector consisting of the diagonal elements of S, i.e., s = [s 1 s 2 ⋯ s m ]T; \( {\boldsymbol{\psi}}_{\mathbf{j}}={\left[{\psi}_j^{(0)}\kern0.5em {\psi}_j^{(1)}\kern0.5em \cdots \kern0.5em {\psi}_j^{\left({k}_{max}\right)}\right]}^T \) is a column vector consisting of the time series of ψ j , with j = 1, 2, ⋯, m; and Ψ = [ψ 1 ψ 2 ⋯ ψ m ].
To prove this relation, let us assume for simplicity but without loss of generality that U is of dimensions 2 × 2, so is S
and that the forcing field consists of only two instantaneous vectors, ψ (0) and ψ (1),
In this simple case, from the convolution definition given by Eq. (B1):
where U 1 and U 2 are the first and second columns of U, and ψ 1 and ψ 2 are defined by:
These are vectors in time, whereas those defined in Eq. (C3) are vectors in space. The same procedure demonstrated by this simple example can be used to prove the general relationship of Eq. (C1) by mathematical induction.
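The 2 × 2 case can be verified numerically. The sketch assumes that the relation being proved states that convolving US with the forcing vectors ψ (k) equals the column-wise convolutions of U j with ψ j, weighted by the singular values s j and summed; the numerical values of U, s, and ψ below are arbitrary illustrations.

```python
def conv1d(a, b):
    """Ordinary convolution of two time series."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


U = [[1, 2], [3, 4]]        # rows: time lags; columns: modes U_1, U_2
s = [5, 7]                  # singular values (diagonal of S)
psi = [[1, -1], [2, 0]]     # psi[k][j]: compressed forcing j at time k
m = len(s)

# Left side: convolve the rows of U S with the forcing vectors psi^(k).
lhs = [0] * (len(U) + len(psi) - 1)
for i, u_row in enumerate(U):
    for k, psi_k in enumerate(psi):
        lhs[i + k] += sum(u_row[j] * s[j] * psi_k[j] for j in range(m))

# Right side: column-wise convolutions U_j * psi_j, then weight by s_j and sum.
C_cols = [conv1d([row[j] for row in U], [p[j] for p in psi]) for j in range(m)]
rhs = [sum(C_cols[j][k] * s[j] for j in range(m)) for k in range(len(lhs))]

print(lhs, rhs)
```

Because the singular values enter only as weights on the column-wise convolutions, those convolutions can be computed once and reused when the weights change.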
Let
Applying the convolution theorem to the column-wise convolutions in Eq. (C9), we obtain:
where Û = fft(U) and \( \widehat{\boldsymbol{\Psi}}=\operatorname{fft}\left(\boldsymbol{\Psi} \right) \) with the necessary zero paddings. The same MATLAB function shown in Appendix 2.B5 can be used to compute C when U and Ψ are input as the first and second arguments to the function, and when the third logical argument, sumcol, is set to be false.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Xu, Z. The all-source Green’s function (ASGF) and its applications to storm surge modeling, part II: from the ASGF convolution to forcing data compression and a regression model. Ocean Dynamics 65, 1761–1778 (2015). https://doi.org/10.1007/s10236-015-0894-y
DOI: https://doi.org/10.1007/s10236-015-0894-y