1 Introduction

Coastal zone inundation and destruction result from factors such as increased river discharge, high tides, severe storms, or waves generated by tectonic events (i.e., tsunamis induced by seismic or volcanic activity), occurring individually or in combination [1]. Global climate changes have driven significant local weather modifications, impacting environmental, economic, and social habits [2, 3]. Coastal areas are becoming more vulnerable to severe weather events because of the growing number of people living in these areas and the effects of global climate change: sea level rise (SLR), warmer waters that fuel more energetic storms and hurricanes, and changes in storm patterns [4,5,6].

The vulnerability of coastal systems, whether characterized by low, sandy beaches or high coasts, is also linked to erosion and subsidence, which can cause instability over time [7] and expose these systems to coastal hazards of higher frequency and magnitude, leading to increased coastal flood risk [8] and greater impacts on society [9].

Sea storms are events in response to severe weather conditions (i.e., low pressure systems and strong winds) which affect the state of equilibrium at sea [10], inducing the water masses to undergo a violent and sudden shift, with subsequent fluctuations in the water level along the coast over time intervals ranging from a few minutes to a few days (storm surges). The general definition gains more specific nuances when considering the geographical area affected by such intense events. Specifically, factors such as the coastal orientation, geomorphologic features, presence of defense structures, and foreshore vegetation [11] play a significant role in dissipating wave energy and significantly influence the initial assessment of storm surges and their resulting consequences. However, the impact of climate change is straining coastal systems, causing increasingly severe damage due to these abnormally high water levels [12].

Developing and implementing Early Warning Systems (EWSs) enhances the prevention and preparedness activities that mitigate the effects of disasters on lives, property, and the environment [13].

In coastal areas, the EWSs are becoming increasingly necessary to prevent difficulties in the actual coastal defense work, enabling timely measures to be taken before coastal flooding arrives [14, 15]. To do this, sophisticated numerical models have been designed and developed to simulate coastal hydrodynamic conditions to provide validated and consistent information on changing water levels associated with appropriate alarm thresholds. They usually need the local bathymetry, the weather conditions and the offshore sea state data (i.e., wave characteristics, sea level, etc.) as numerical model input and return the alert level of coastal flood in the coastal urban area.

In extra-tropical climate conditions, such as those of our test case study area (Torre del Greco, Campania, Italy), severe weather-marine events are difficult to predict with local accuracy for the following key reasons: a) they are profoundly local and influenced by the high-resolution orography and land use; b) the local data assimilation is a crucial modeling issue that needs a pervasive and geographically distributed sensor network; c) simulation and forecast models have to be coupled in a simple, data typed, consistent way to perform large size local scale ensembles; d) finally, the computational and data storage resources are a limiting factor because they have to be dynamically allocated. This research paper presents a user-friendly computation platform based on scientific workflows dedicated to weather/marine event simulation and prediction. In particular, we present the Shoreline Alert Model (SAM), a high-performance, parallel, and hierarchically distributed computational model for coastal run-up simulation and forecasting [16]. SAM delivers simulations, forecasts, and what-if hypotheses about coastal marine ingression and intense flooding events. It is a model component of an EWS that greatly supports the relevant authorities’ decisions in both the prevention phase of risk to human life and the subsequent detection phase of damage to strategic infrastructure and the coastal environment.

SAM has been designed to return alarm thresholds that may be used for coastal flooding mitigation measures. The computational kernel leverages a hierarchically distributed parallelization schema working on distributed memory and shared memory architectures [17].

SAM represents a critical bridge between wind-driven wave models and local run-up models in coastal modeling. Wind-driven wave models excel in deep or mid-depth water but, because of bathymetric complexities, often struggle to represent wave dynamics accurately in shallow coastal waters. On the other hand, local run-up models are proficient in simulating coastal inundation from intense weather phenomena but may lack the broad applicability needed for comprehensive coastal management. SAM bridges this gap by integrating the strengths of both kinds of model. With high-resolution bathymetric data and advanced numerical techniques, SAM simulates wave processes in shallow coastal waters, considering interactions with the seabed and coastline. This enables SAM to provide detailed forecasts of wave characteristics near the shoreline, giving coastal scientists, engineers, and policymakers essential tools for effective coastal management and planning.

SAM has been designed to be scalable, delivering accurate results in a high-performance fashion. It leverages hierarchical parallel processing featuring both distributed memory and shared memory techniques. Although SAM can be executed on a personal computer, it has been designed to achieve the best performance with High-Performance Computing resources enforcing the Many Tasks Parallelism [18] approach.

Considering the SAM design, HPC is essential for the shoreline early warning system due to the need to process vast data efficiently for reliable hazard predictions. We motivate the need to leverage HPC to produce accurate coastal flooding and disaster forecasting, enabling faster decision-making. The HPC approach followed in the SAM design facilitates advanced early warning systems by integrating weather and wind-driven wave data. Sensing, monitoring, and decision-support components for comprehensive predictions can be added in the future. SAM is a crucial part of the environmental application running as an operational workflow on the University of Naples “Parthenope” facilities (Footnote 1). This application integrates data from various sources like weather forecasts, sea level measurements, and tide gauges, enhancing understanding and decision-making for coastal hazards.

The rest of the paper is organized as follows: the related work is discussed in Sect. 2; the adopted methodology is described in Sect. 3, followed by the description of the data pre-processing in Sect. 3.3.1. Section 3.3.2 describes the SAM architecture and the adopted parallelization schema, while the evaluation and some preliminary results are described in Sect. 4, together with a real-world SAM application. Finally, Sect. 5 reports the conclusions and future research directions.

2 Related work

EWSs play an important role in coastal planning and management, reducing disaster risks, and are essential for adapting to climate change. They provide timely and precise information about coastal flooding impacts, helping to preserve lives and prevent significant economic losses [19]. Common strategies in EWS implementation are based on empirical methods [20, 21], neural networks [22], and numerical models [16, 17]. Most of the relevant literature focuses on capturing past flood events or predicting long-term (years to decades) flood risks for large-scale coastal areas, mainly based on various climate change projections [23]. To our knowledge, only a few research efforts have addressed short-term forecasts. Among others, [24] developed a coastal flooding EWS for the Taiwanese coast, which is frequently threatened by typhoons, by integrating existing sea-state monitoring technology, numerical ocean forecasting models, historical databases and experience, and computer science. [25] developed an EWS for the shoreline of Imperial Beach (San Diego, California) that provides the total water level by combining predictions of tides and sea-level anomalies with wave run-up estimates. Recently, [12] described the semi-probabilistic XBeach-based coastal EWS currently used as a Decision Supporting System (DSS) by regional authorities in the decision-making process involving the Emilia Romagna region (Northwest Adriatic Sea, Italy). The system calculates maximum water levels for each time step and checks the results against predefined thresholds. Thresholds are parameterized by coastal characterization, from natural to urbanized shores, to evaluate the storm surge indicators [26].

Despite continuous efforts to develop a framework for effective EWSs, further research is required to accurately model nearshore hydrodynamics, simulate coastal inundation, and reduce the intensive computations of numerical models [23]. In this context, artificial intelligence (AI) and data-driven methods are expanding their range of applications quickly, partly because of their ability to swiftly analyze vast datasets and derive valuable, trustworthy insights [27]. Some efforts have been made to model coastal morphodynamics (i.e., shoreline changes, beach realignment, sea level rise, etc.) with AI techniques instead of classical numerical modeling, with promising results [28,29,30,31]. Based on this premise, [23] described an EWS to detect potential flooding and consequently improve societal preparedness for flood risks in the coastal area of Rethymno (Island of Crete, Greece). The system was designed, implemented, and validated using hindcast and forecast sea-state data, empirical formulas, wave propagation, numerical hydrodynamic models, and an artificial neural network. The system output is the maximum flow depth for each subarea, which is categorized to determine the corresponding inundation risk. Following this approach, [32] presented a wave-induced flooding EWS that combines Bayesian Networks and numerical models to predict the impacts of wave-overtopping coastal flooding on pedestrians, urban components and buildings, and vehicles in urban areas fronted by the sandy beaches of Praia de Faro (southern coast of Portugal).

In this framework, this study presents SAM, an operational EWS that forecasts potential coastal flooding in order to improve societal preparedness for coastal flood risks. The methodological approach proposed herein involves implementing and coupling a suite of numerical weather and sea-state models and empirical formulas.

3 Methodology

The EWS has been configured using a high-performance computing system to manage and run a scientific workflow which comprises the community numerical models Weather Research and Forecasting (WRF) [33], Wavewatch III (WW3) [34], and Shoreline Alert Model (SAM), implementing the empirical approach to evaluate the alert level as a function of the shoreline characteristics using Python programming. The operational system is based on complex data pre-processing, simulation, post-processing, and inter-comparison dataflow, provided by the DagOnStar workflow engine [35] (Fig. 1).

Fig. 1 Simplified block diagram of the configured scientific workflow

The workflow starts with the WRF numerical model, which forecasts the atmospheric forcing needed to drive the WW3 model. WW3 in turn estimates the offshore waves that determine the wave characteristics in shallow water. The action of waves on the shoreline depends not only on the offshore sea conditions but also on the variations in bathymetry and angle of approach that modify the waves as they propagate (i.e., shoaling and refraction processes).

SAM can be considered a numerical solver that quickly provides output values by processing data through empirical formulas of coastal engineering. For each positioned transect, SAM repeats the process using its parallel architecture, which breaks down the procedure to make it faster. The system is therefore highly modifiable and able to calculate parameters such as run-up height or overtopping discharge. The results are then compared with predefined thresholds derived from the historical records of damaging events in the study area, the geomorphology of the coast, and the presence/absence of coastal defense structures.
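The threshold comparison described above can be illustrated with a minimal sketch. The threshold values, the label names, and the function `alert_level` are hypothetical and chosen only for illustration; the operational thresholds depend on site history, geomorphology, and defense structures.

```python
from bisect import bisect_right

# Hypothetical alert classes, ordered from lowest to highest severity.
ALERT_LABELS = ("none", "moderate", "intermediate", "high")


def alert_level(run_up_m, thresholds_m=(0.5, 1.0, 2.0)):
    """Map a run-up height (m) to an alert label.

    thresholds_m holds ascending lower bounds (m) of the three alert classes.
    These numbers are illustrative assumptions, not calibrated values.
    """
    if list(thresholds_m) != sorted(thresholds_m):
        raise ValueError("thresholds must be in ascending order")
    if len(thresholds_m) != len(ALERT_LABELS) - 1:
        raise ValueError("one threshold is needed per alert class")
    # bisect_right counts how many thresholds the value has reached.
    return ALERT_LABELS[bisect_right(thresholds_m, run_up_m)]


if __name__ == "__main__":
    for value in (0.3, 0.8, 1.5, 2.4):
        print(value, alert_level(value))
```

Keeping the thresholds as a per-transect argument reflects the idea that each coastal stretch may have its own limits.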

3.1 Weather forecasting model

WRF is an atmospheric numerical model that computes 10 m wind fields and other atmospheric forcing needed to drive the WW3 offshore wave model, yielding the initial and boundary conditions for the shallow-water simulation of wave transformation and run-up. To produce the numerical simulations presented in this paper, we configured the WRF model, initialized with the Global Forecast System (GFS) produced by the National Centers for Environmental Prediction (NCEP), with two-way nested computational domains: a coarse domain \(d01_\textrm{WRF}\) covering the whole of Europe, an intermediate domain \(d02_\textrm{WRF}\) covering the Italian peninsula, and a fine domain \(d03_\textrm{WRF}\) covering the Campania Region, with 25 km, 5 km and 1 km spatial resolution, respectively.

3.2 Wave forecasting model

Wave simulations were carried out using the WW3 model [36], a third-generation wave model developed at NOAA/NCEP. A two-way nesting approach was applied to configure the WW3 model with a 0.09\(^\circ\) spatial resolution for the coarse domain \(d01_\textrm{WW3}\) on the Mediterranean Sea, 0.03\(^\circ\) for the intermediate domain \(d02_\textrm{WW3}\) covering the Italian seas, and 0.01\(^\circ\) for the fine domain \(d03_\textrm{WW3}\) including the sea in front of the Campania region. In coastal regions, where bathymetry plays a crucial role in accurately simulating wave dynamics, the limitation of a resolution of about 1000 m becomes apparent. The shallow waters of the use case area are characterized by intricate bathymetric features and complex coastlines that demand higher-resolution models for more precise predictions. With its current resolution, WW3 may not adequately capture the nuances of wave behavior near the shore. To address this challenge, a development initiative to enhance the resolution of the WW3 model’s \(d03_\textrm{WW3}\) domain is planned and partially implemented. By incorporating EMODnet 2020 (Footnote 2) bathymetry data at 1/16 arc-minute resolution (approximately 115 m), we aim to significantly improve the model’s fidelity in coastal shallow waters. The integration of higher-resolution bathymetry data into the WW3 model represents a significant step forward in coastal modeling efforts.

The spatial domain \(d01_\textrm{WW3}\) is thus a closed domain forced only by the weather conditions provided by the WRF offline coupled data; therefore, no wave boundary conditions were necessary.

Following the operative procedure described in [37], the WW3 grid point (belonging to the high-resolution \(d03_\textrm{WW3}\) spatial domain) nearest to the studied cross-shore profiles is the considered virtual buoy (VB) for the methodology steps described below.
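The virtual buoy selection can be sketched as a nearest-neighbour search over the valid (sea) points of the high-resolution grid. The tuple layout of `grid_points`, the use of NaN or `None` to mark land points, and the function names are assumptions made for this illustration only.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two lon/lat points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_virtual_buoy(profile_lon, profile_lat, grid_points):
    """Return (distance_km, lon, lat, hs) of the closest valid grid point.

    grid_points is an iterable of (lon, lat, hs); hs is None or NaN on
    land points (assumed convention), which are skipped.
    """
    best = None
    for lon, lat, hs in grid_points:
        if hs is None or math.isnan(hs):
            continue  # masked point: no wave data available here
        dist = haversine_km(profile_lon, profile_lat, lon, lat)
        if best is None or dist < best[0]:
            best = (dist, lon, lat, hs)
    if best is None:
        raise ValueError("no valid grid point found")
    return best


if __name__ == "__main__":
    grid = [(14.36, 40.78, float("nan")), (14.35, 40.77, 1.2), (14.30, 40.70, 2.0)]
    print(nearest_virtual_buoy(14.36, 40.78, grid))
```

A linear scan is enough for a sketch; for tens of thousands of transects a spatial index would be a natural refinement.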

3.3 Shoreline alert model

SAM is a numerical solver used in coastal flooding contexts to simulate the wave run-up height and consequent coastal flooding. It uses a one-dimensional approach to simulate this physical phenomenon. The Python-based software was designed to be highly modular so that it can be integrated as a software module into operational coastal early warning systems, where it manages the alarm thresholds according to the duration and intensity of the forecasted storm event.

The wave condition in the VB (\(H_s\), \(T_m\) and \(D_m\)), the beach slope \(\beta\), the beach width w derived from the cross-shore beach profile, and the bottom depth d represent the main SAM inputs.

SAM is designed to be modular concerning the empirical evaluation of the alert level. Based on the \(\beta\) value, each coastal profile is associated with a shoreline typology: (i) rock-armored structures with narrow surf zones (slope ranging from 1:8 to 1:1); (ii) beach (slope up to 1:10); (iii) structures located in the surf zone (slope up to 1:1); (iv) vertical walls (slope higher than 1:1). For all these scenarios, the wave conditions in shallow water are obtained from those given by the WW3 system at the VB by applying the shoaling and refraction coefficients as a function of bathymetry and wave angle of approach. The relative coefficients \(K_s\) and \(K_r\) are given by:

$$\begin{aligned} K_{s}=\left( \frac{C_{g_{o}}}{C_{g}} \right) ^{\frac{1}{2}}=\left[ \frac{2\cosh ^{2}kd}{2kd+\sinh 2kd} \right] ^{\frac{1}{2}} \end{aligned}$$
(1)

where \(C_{g_{o}}\) is the group celerity in deep water, \(C_g\) is the group celerity in intermediate water, and k is the wave number.

$$\begin{aligned} K_{r}=\left( \frac{b_{0} }{b}\right) ^{\frac{1}{2}}=\left[ \frac{1-\sin ^{2}\alpha _{0}\tanh ^{2}kd}{\cos ^{2}\alpha _{0}} \right] ^{-\frac{1}{4}} \end{aligned}$$
(2)

where \(b_0 / b\) is the ratio of the distance between two adjacent wave rays on deep and intermediate water, and \(\alpha _{0}\) is the deep water incidence angle. Finally, the incidence angle on depth d is found from Snell’s law:

$$\begin{aligned} \frac{C}{C_{0}}=\frac{\sin \alpha }{\sin \alpha _{0}}=\frac{L}{L_{0}}=\tanh kd \rightarrow \sin \alpha =\sin \alpha _{0} \tanh kd \end{aligned}$$
(3)

Solving for \(\alpha\) yields:

$$\begin{aligned} \alpha =\arcsin \left( \frac{C}{C_{0}} \sin \alpha _{0} \right) \end{aligned}$$
(4)

The wave height at depth d is then calculated from the VB wave height as:

$$\begin{aligned} H_{s_i}=K_{s}K_{r}H_{s} \end{aligned}$$
(5)
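Equations (1) to (5) can be chained as in the following minimal sketch. The wave number is obtained here by a Newton iteration on the linear dispersion relation, which is a stand-in chosen for self-containment rather than the Hunt approximation of Eq. (13); the function names and the sample values are illustrative assumptions.

```python
import math

G = 9.81  # gravitational acceleration (m/s^2)


def wavenumber(period_s, depth_m, tol=1e-12, max_iter=100):
    """Solve sigma^2 = g*k*tanh(k*d) for k by Newton iteration."""
    sigma2 = (2.0 * math.pi / period_s) ** 2
    k = sigma2 / G  # deep-water value used as first guess
    for _ in range(max_iter):
        t = math.tanh(k * depth_m)
        f = G * k * t - sigma2
        df = G * t + G * k * depth_m * (1.0 - t * t)
        k_next = k - f / df
        if abs(k_next - k) < tol:
            return k_next
        k = k_next
    return k


def shoaling_coefficient(k, depth_m):
    """Eq. (1): K_s = [2 cosh^2(kd) / (2kd + sinh(2kd))]^(1/2)."""
    kd = k * depth_m
    return math.sqrt(2.0 * math.cosh(kd) ** 2 / (2.0 * kd + math.sinh(2.0 * kd)))


def refraction_coefficient(alpha0_rad, k, depth_m):
    """Eq. (2): K_r from the deep-water incidence angle alpha0."""
    t = math.tanh(k * depth_m)
    ratio = (1.0 - math.sin(alpha0_rad) ** 2 * t ** 2) / math.cos(alpha0_rad) ** 2
    return ratio ** -0.25


def local_angle(alpha0_rad, k, depth_m):
    """Eqs. (3)-(4): Snell's law, sin(alpha) = sin(alpha0) * tanh(kd)."""
    return math.asin(math.sin(alpha0_rad) * math.tanh(k * depth_m))


def transform_wave(hs, period_s, alpha0_deg, depth_m):
    """Eq. (5): return (H_si, local angle in degrees, K_s, K_r)."""
    k = wavenumber(period_s, depth_m)
    a0 = math.radians(alpha0_deg)
    ks = shoaling_coefficient(k, depth_m)
    kr = refraction_coefficient(a0, k, depth_m)
    return ks * kr * hs, math.degrees(local_angle(a0, k, depth_m)), ks, kr


if __name__ == "__main__":
    # Illustrative input: Hs = 2 m, Tm = 8 s, 30 degree incidence, 5 m depth.
    print(transform_wave(2.0, 8.0, 30.0, 5.0))
```

In very deep water both coefficients tend to 1, so the sketch returns the input wave height unchanged, which is a quick sanity check on the formulas.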

In the surf zone, the rise of the mean water level at depth d was calculated following [38]:

$$\begin{aligned} \bar{\zeta } \left( x \right) =\bar{\zeta _{b}}+\frac{3\gamma _{b}^{2}}{8}\left( 1+\frac{3\gamma _{b}^{2}}{8} \right) ^{-1}\left[ d_{b}-d\left( x \right) \right] \end{aligned}$$
(6)

where \(\bar{\zeta _{b}}\) is the wave set down at the breaking depth, given by:

$$\begin{aligned} \bar{\zeta _{b}}=-\frac{1}{16}\gamma _{b}H_{b} \end{aligned}$$
(7)

where \(H_b\) is breaking wave height and \(\gamma _{b}\) is the breaker index.

For spilling type breakers on dissipative beaches, the assumption commonly employed is that \(\gamma _{b}\) remains a fixed ratio throughout the entire surf zone:

$$\begin{aligned} \gamma _{b}\approx \left( \frac{H}{d} \right) _{b} \end{aligned}$$
(8)

where \(d_b\) is the mean depth at breaking. Various authors have proposed expressions for the breaker index. According to [39], the significant breaker index is given by:

$$\begin{aligned} \gamma _{b}=0.56e^{3.5m} \end{aligned}$$
(9)

where m is the slope of the seabed.
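The wave setup chain (Eqs. 6 to 9) can be sketched as follows. The sketch assumes that the setup gradient coefficient in Eq. (6) is read as \(K = (3\gamma_b^2/8)/(1+3\gamma_b^2/8)\), the usual result of the surf-zone momentum balance, and that the breaking depth follows from Eq. (8) as \(d_b = H_b/\gamma_b\). Function names and sample values are illustrative assumptions.

```python
import math


def breaker_index(bed_slope):
    """Eq. (9): gamma_b = 0.56 * exp(3.5 * m)."""
    return 0.56 * math.exp(3.5 * bed_slope)


def set_down_at_breaking(gamma_b, breaking_height_m):
    """Eq. (7): wave set-down at the breaking depth (negative value)."""
    return -gamma_b * breaking_height_m / 16.0


def mean_water_level(depth_m, gamma_b, breaking_height_m):
    """Eq. (6): mean water level change at a depth inside the surf zone.

    Assumes the coefficient K = c / (1 + c) with c = 3 * gamma_b^2 / 8.
    """
    d_b = breaking_height_m / gamma_b  # Eq. (8): gamma_b = (H / d)_b
    if depth_m > d_b:
        raise ValueError("depth lies seaward of the breaking point")
    c = 3.0 * gamma_b ** 2 / 8.0
    k_setup = c / (1.0 + c)
    return set_down_at_breaking(gamma_b, breaking_height_m) + k_setup * (d_b - depth_m)


if __name__ == "__main__":
    gamma = breaker_index(0.02)  # illustrative 1:50 seabed slope
    print("breaker index:", gamma)
    print("setup at the shoreline (m):", mean_water_level(0.0, gamma, 2.0))
```

At the breaking depth the function returns the set-down of Eq. (7), and the level rises linearly toward the shoreline as the depth decreases.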

The barometric setup was calculated as follows:

$$\begin{aligned} \Delta \zeta = \frac{\Delta P_{a}}{\rho g} \end{aligned}$$
(10)

where \(\Delta P_{a}\) is the pressure variation during the event, \(\rho\) is the seawater density, and g is the gravitational acceleration.
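As a quick numerical check of Eq. (10), the sketch below evaluates the barometric setup for a given pressure drop. The default density of 1025 kg/m^3 and the sign convention (a positive value means a pressure drop below the reference) are assumptions made for this example.

```python
def barometric_setup(delta_p_pa, rho=1025.0, g=9.81):
    """Eq. (10): water level change (m) for a pressure variation in Pa.

    rho is an assumed typical seawater density (kg/m^3); delta_p_pa is
    taken as positive for a pressure drop (assumed convention).
    """
    return delta_p_pa / (rho * g)


if __name__ == "__main__":
    # A 10 hPa (1000 Pa) drop gives roughly 0.1 m of setup.
    print(round(barometric_setup(1000.0), 3))
```

This reproduces the familiar rule of thumb of about 1 cm of sea level rise per hPa of pressure drop.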

The run-up height \(\textrm{Ru}_{x\%}\) is defined as the wave run-up level, measured vertically from the still water line, which is exceeded by \(x\%\) of the number of incident waves [40]. [41] expressed the run-up \(\textrm{Ru}_{2\%}\) as a function of two separate terms that account for the different contributions of the wave setup and of the swash (the first and second terms inside the parentheses on the right-hand side, respectively).

$$\begin{aligned} \textrm{Ru}_{2\%}=1.1\left( 0.35\tan \beta \left( H_{0}L_{0} \right) ^{0.5}+ \frac{\left[ H_{0}L_{0}\left( 0.563\tan ^{2}\beta +0.004 \right) \right] ^{0.5}}{2} \right) \end{aligned}$$
(11)

\(H_0\) is the deep water significant wave height, which can be related to the value of the VB through the ratio of the respective wave celerity \(C_0=L_0/T_m\) and \(C_\textrm{VB}=L_\textrm{VB}/T_{m}\) [42]:

$$\begin{aligned} H_{0}=H_{s}\frac{C_\textrm{VB}}{C_{0}} \end{aligned}$$
(12)

In VB, the wavelength is equal to \(L_\textrm{VB}=(2 \pi )/k\) in which k is the wave-number obtained by the Hunt approximation of the standard dispersion relation [43]:

$$\begin{aligned} \left( kd \right) ^{2}=\left( \frac{\sigma ^{2}d}{g} \right) ^{2}+\frac{\left( \frac{\sigma ^{2}d}{g} \right) }{1+\sum _{n=1}^{6}d_{n}\left( \frac{\sigma ^{2}d}{g} \right) ^{n}} \end{aligned}$$
(13)

where \(d_n\) are six constant values given by [43], and \(\sigma\) is the angular wave frequency.
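Equations (11) to (13) can be combined as in the sketch below. The six coefficients are the values commonly quoted for Hunt's approximation and should be checked against [43]; the deep-water wavelength is taken from linear theory as \(L_0 = gT_m^2/(2\pi)\), and the beach slope is passed directly as \(\tan\beta\). Function names and sample inputs are illustrative assumptions.

```python
import math

G = 9.81  # gravitational acceleration (m/s^2)

# d_1 .. d_6 as commonly quoted for Hunt's approximation (assumption:
# verify against [43] before any operational use).
HUNT_D = (0.6666666666, 0.3555555555, 0.1608465608,
          0.0632098765, 0.0217540484, 0.0065407983)


def hunt_wavenumber(period_s, depth_m):
    """Eq. (13): explicit approximation of the dispersion relation."""
    sigma = 2.0 * math.pi / period_s  # angular frequency (rad/s)
    y = sigma ** 2 * depth_m / G
    series = sum(d_n * y ** n for n, d_n in enumerate(HUNT_D, start=1))
    kd = math.sqrt(y ** 2 + y / (1.0 + series))
    return kd / depth_m


def deep_water_height(hs_vb, period_s, depth_vb_m):
    """Eq. (12): return (H_0, L_0) from the virtual buoy wave height."""
    l_vb = 2.0 * math.pi / hunt_wavenumber(period_s, depth_vb_m)
    l_0 = G * period_s ** 2 / (2.0 * math.pi)  # deep-water wavelength
    c_vb = l_vb / period_s
    c_0 = l_0 / period_s
    return hs_vb * c_vb / c_0, l_0


def run_up_2pct(h0, l0, tan_beta):
    """Eq. (11): 2% exceedance run-up from setup and swash terms."""
    setup = 0.35 * tan_beta * math.sqrt(h0 * l0)
    swash = math.sqrt(h0 * l0 * (0.563 * tan_beta ** 2 + 0.004)) / 2.0
    return 1.1 * (setup + swash)


if __name__ == "__main__":
    # Illustrative input: Hs = 2 m and Tm = 8 s at a 20 m deep virtual buoy.
    h0, l0 = deep_water_height(2.0, 8.0, 20.0)
    print("H0 (m):", h0, "L0 (m):", l0)
    print("Ru2% (m):", run_up_2pct(h0, l0, 0.1))
```

Because the approximation is explicit, no iteration is needed, which suits a kernel that is evaluated once per transect and per forecast step.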

The wind setup in the coastal stretches where the total effect was high was calculated according to the equations proposed by [44].

The run-up and the coastal setup, the latter calculated as the sum of the wind, wave, and barometric setups, together provide the input to the coastal flooding evaluation [45].

3.3.1 SAM cross-shore shoreline profiles methodology

The shoreline cross-shore profiles, or transects, for coastal hazards analysis and subsequent coastal flooding forecasting are determined based on physical factors, such as the changes in topography, bathymetry, shoreline orientation, and land cover data (see Fig. 2). To do this, Algorithm 1 has been implemented.

It needs as input data:

  • The coastline in Esri shapefile digital map format (Inp\(_1\));

  • The study area’s digital elevation model (DEM) (Inp\(_2\)) integrating ocean bathymetry and land topography (up to 10 m a.s.l.). It is obtained by interpolating the EMODnet 2020 bathymetry dataset with a Kriging approach [46], reaching a final grid spatial resolution of about 25 m;

  • WW3 model spatial lon-lat grid computational domain interpolated on DEM points (Inp\(_3\)).

Fig. 2 Setting up of the transects along the Torre del Greco coastline and localization of the beach (white dotted circle) and virtual buoy (red point) involved in the real test case application (color figure online)

For each point in the shapefile (line 5), the Algorithm checks that it is within the desired area of interest and calculates the distance between the current and previous points (lines 6–7). If this distance exceeds a certain threshold, the segment between the two points is divided into equispaced sub-segments of the desired length (lines 8–12). Finally, for each point in the new segment, the Algorithm calculates the bearing between the considered point and the previous one to create the transect.

To create the sea profile, we move toward the sea until we encounter a valid value of the WW3 model, while to create the land profile, we move toward the land until we reach a specific set altitude or distance (lines 14–20). Figure 2 shows the created transects using the described procedure.
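The densification and bearing steps can be sketched as below. This is not the paper's Algorithm 1: linear interpolation in lon/lat for short segments, the spherical bearing formula, and the choice of which normal points seaward (it depends on the digitization direction of the coastline) are assumptions of this illustration, as are all function names.

```python
import math

EARTH_RADIUS_M = 6371000.0


def distance_m(p1, p2):
    """Haversine distance in metres between (lon, lat) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (*p1, *p2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2.0 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bearing_deg(lon1, lat1, lon2, lat2):
    """Initial bearing from point 1 to point 2, in degrees from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    x = math.sin(dlmb) * math.cos(phi2)
    y = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb)
    return math.degrees(math.atan2(x, y)) % 360.0


def densify(p_prev, p_cur, spacing_m):
    """Split a long segment into equispaced points no more than spacing_m apart.

    Returns the new points, ending at p_cur and excluding p_prev.
    """
    n = max(1, math.ceil(distance_m(p_prev, p_cur) / spacing_m))
    return [(p_prev[0] + (p_cur[0] - p_prev[0]) * i / n,
             p_prev[1] + (p_cur[1] - p_prev[1]) * i / n) for i in range(1, n + 1)]


def transect_bearings(p_prev, p_cur):
    """Bearings of the two normals to the local coastline direction."""
    along = bearing_deg(*p_prev, *p_cur)
    return (along + 90.0) % 360.0, (along - 90.0) % 360.0


if __name__ == "__main__":
    start, end = (14.35, 40.78), (14.36, 40.79)
    points = densify(start, end, 250.0)
    print(len(points), "points; normals:", transect_bearings(start, points[0]))
```

From each densified point, one normal would then be followed seaward until a valid wave-model value is found, and the other landward until the set altitude or distance is reached.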

Algorithm 1 Calculate transects

3.3.2 Architecture and parallelization scheme

The implemented parallelization scheme is based on different parallel sub-schemes. The sub-schemes can be combined with one another, providing a hierarchical parallelization scheme.

Figure 3 represents the proposed parallelization schema, data flow, and the active software component.

Fig. 3 SAM hierarchical parallelization schema. np is the number of processors, nt is the number of threads per process

A standard paradigm in HPC is domain decomposition. The problem (the set of transects along the Campania coast) is divided into lots that are distributed to several executors. Each executor is an instance of a computer program (process) in charge of computing its assigned partition of the problem. Because CPUs comprise multiple computing cores, each process can further split its partition among threads that run concurrently on those cores. While the threads of the same process share the same memory, processes communicate by exchanging data messages. As demonstrated later in the paper (Sect. 4), the problem decomposition markedly improves the overall computing performance as the problem size increases. In detail, considering a shared and distributed memory scenario, np represents the number of available processors, and nt represents the number of threads used on each processor. In Fig. 3, \(P_0\) details the processor-level behavior, and \(P_1\) shows how the computation is concurrently performed on nt threads. Algorithm 3 represents the entire cycle and details the implementation of the parallelization scheme.

Algorithm 2 Calculate alert index

The domain decomposition is performed by \(P_0\), which divides the transects into subsets \(\Delta _0, \Delta _1, \ldots , \Delta _{np}\). The subsets are distributed to the processors \(P_p\). Each processor \(P_p\) holds its local transect data pD. On each processor \(P_p\), the local dataset is divided into subsets \(\Delta _0, \Delta _1, \ldots , \Delta _{nt}\), each assigned to a thread \(t \in [0, nt]\). Each thread \(T_t\) executes the algorithm (calculate alert indexes) sequentially for each local transect datum \(tF_n \in pD_t\). The \(P_0\) cycle ends by gathering the alert indexes from each processor. The parallelization scheme (Fig. 3) enables the final user to choose any combination of the following execution models:

  • Single run: the single process \(P_0\), calling the procedure calculate_alert_index, solves the problem sequentially for all profiles of its domain \(D_t\).

  • Distributed memory run on np processes: each process \(P_p, p \in [0, np]\), calling the procedure calculate_alert_index, solves the problem for all transects of its sub-domain pD.

  • Shared memory run on nt threads: using the shared distribution paradigm, a subdomain of data tD is assigned to each thread of the multi-core environment. Each thread \(t \in [0, T]\) works on a sub-set tD, calling the procedure calculate_alert_index.
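The two-level decomposition behind these execution models can be sketched in plain Python. This is an illustration, not the paper's Algorithm 3: the outer (process) level is emulated by a sequential loop where an MPI program would scatter and gather, the body of `calculate_alert_index` is a placeholder, and the helper names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, parts):
    """Split items into `parts` contiguous, nearly equal chunks."""
    q, r = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + q + (1 if i < r else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def calculate_alert_index(transect):
    """Placeholder for the per-transect computation (hypothetical body)."""
    return transect * 2


def process_level(local_transects, nt):
    """Work of one process P_p: fan its subset out to nt threads."""
    def worker(chunk):
        # Each thread handles its chunk sequentially.
        return [calculate_alert_index(t) for t in chunk]

    with ThreadPoolExecutor(max_workers=nt) as pool:
        parts = list(pool.map(worker, split(local_transects, nt)))
    return [value for part in parts for value in part]


def run(transects, np_procs, nt):
    """Emulate P_0: decompose, compute on each subset, gather in order.

    In an MPI program the loop below would be a scatter from P_0, one
    process_level call per rank, and a gather back on P_0.
    """
    results = []
    for local in split(transects, np_procs):
        results.extend(process_level(local, nt))
    return results


if __name__ == "__main__":
    print(run(list(range(10)), np_procs=3, nt=2))
```

Setting `np_procs=1, nt=1` reproduces the single run, while the other combinations mirror the distributed, shared, and hybrid models. In CPython, threads speed up CPU-bound work only when the kernel releases the global interpreter lock, so a real kernel would rely on native code for the thread level.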

Algorithm 3 Hierarchical/heterogeneous parallelism

4 Test case application and results

To present SAM’s performance as an EWS in a real case application, we selected an intense meteo-marine storm event that occurred in the Gulf of Naples (Campania, Italy), specifically in the municipality of Torre del Greco (Fig. 2), where the succession of increasingly frequent high-intensity sea storm events is causing long-term damage and compromising the effectiveness of the current coastal defences [47].

The presented test case concerns a sea storm that occurred on 29–30 October 2018 [48] and involved a stretch of coastline consisting of a beach located near the harbor (Fig. 4a), varying in width from 10 to 20 m. The beach suffered from coastal flooding despite the presence of an artificial defense work of natural blocks, which was heavily damaged during the storm (Fig. 4b–d), creating problems for the commercial activities behind the shoreline.

To evaluate the performance of SAM in a real case study application of which the causes and effects had already been noted (photos, videos, measurements, etc.), we simulated the storm event using the scientific workflow WRF-WW3-SAM which showed satisfactory performance in terms of prediction accuracy (Fig. 4). We subsequently compared the forecast of the alert levels with the damage estimated and documented in the days following the storm. From this it can be stated that, although it is a prototype system, the warning forecasts proved to be true. In particular, the coastal stretch for which SAM had foreseen a high alert level actually suffered the most extensive damage.

Fig. 4 a Results of the SAM real test case application to the coastal stretch adjacent to the port of Torre del Greco during the sea storm event of 29–30 October 2018. The SAM forecast highlighted 3 different alert levels: High (red), Intermediate (orange), Moderate (yellow); b Limited damage to the promenade behind the breakwater due to the impact of the waves (yellow zone); c Increase in damage to the coastline due to the detachment of breakwater armor units and widespread overtopping phenomena (red zone); d Flooding phenomena and damage detected in the infrastructure close to the promenade (red zone) (color figure online)

To test the SAM performance, we set up a test-bed experiment using one computing node of purpleJeans (Footnote 3), an HPC system equipped with two Intel(R) Xeon(R) Gold 5218 CPUs @ 2.30 GHz (16 cores each). The number of transects considered was about 20K along the entire Campanian coast. However, this number can be further increased by reducing the distance between points in the shapefile (as described in Algorithm 1). Furthermore, we consider different processors hosted on the same computing node, avoiding inter-node communications. Finally, we use the following configurations:

  • Baseline: we considered as Baseline the performance measured using only one process and one thread.

  • Distributed Memory (MPI approach): SAM has been tested using 1, 2, 4, and 8 MPI processes on the same computing node.

  • Shared Memory (OpenMP approach): considering only one MPI process, we used 1, 2, 4, and 8 threads.

  • Distributed Memory and Shared Memory (MPI-OpenMP approach): for this evaluation, we ran 1, 2, 4, and 8 MPI processes on the same computing node, varying the number of threads from 1 to 8.

Figure 5 presents an analysis of three distinct parallel computing approaches for SAM. The subfigures show the runtime performance of the shared memory (OpenMP), distributed memory (MPI), and hybrid (OpenMP-MPI) models.

Fig. 5 The execution time of the a OpenMP, b MPI, and c OpenMP-MPI approaches

The baseline performance in Fig. 5 is the execution time of SAM in a base configuration consisting of one process and one thread (1P1T).

Figure 5a illustrates the behavior of the shared memory approach utilizing OpenMP. The blue bars represent the execution time with varying numbers of threads. Notably, as the number of threads increases up to 8 (1P2T to 1P8T, where P stands for processes and T for threads), there is a consistent decrease in runtime. This suggests that the shared memory model scales well with additional threads. Compared to the Baseline, which is an efficient sequential execution not employing parallelism, the addition of parallel threads significantly enhances performance. The largest performance gain is observed when moving from sequential execution to two threads (1P2T), indicating that even minimal parallelism can drastically reduce execution times.

In Fig. 5b, the performance of the distributed memory approach via MPI is depicted. The purple bars represent the execution time, which, similar to the shared memory approach, shows a decline as the number of MPI processes increases. This reduction from 2P1T to 8P1T implies that the application scales well in a distributed environment. It’s particularly interesting to note that while the distributed memory model does not match the speedup of the shared memory model, it nonetheless shows substantial improvement over the baseline.

Figure 5c presents the most striking results: the hybrid approach, combining both OpenMP and MPI. The red bars highlight the execution time for each configuration, revealing a notable trend: the more combined processes and threads utilized, the shorter the runtime, culminating in the fastest execution time at 8P8T. This subfigure is critical as it demonstrates that the hybrid approach not only benefits from both shared and distributed memory advantages but also produces a synergistic effect that significantly outperforms the individual parallel computing models. It confirms the premise that the hybrid model is the most efficient, achieving an impressive 24-fold speedup compared to the baseline.

The overarching conclusion from Fig. 5 is that parallel computing significantly improves performance over sequential execution. The shared memory model provides a strong foundation for speedup through thread parallelism. The distributed memory model demonstrates the capability to scale across discrete computational resources. However, it is the hybrid model that shows the most promising results, leveraging the strengths of both shared and distributed memory. This hybrid approach not only accelerates computation but also showcases the potential for optimizing large-scale applications that require the collaborative power of multi-threading and process distribution.

These results indicate the potential for considerable efficiency gains in high-performance computing applications, particularly when implementing a hybrid approach that effectively utilizes both OpenMP and MPI. This can lead to more responsive and scalable systems, paving the way for more complex and computationally intensive tasks to be performed in shorter timeframes.

Fig. 6 Speed-up of the (a) OpenMP, (b) MPI, and (c) OpenMP-MPI approaches

Figure 6 shows a visual comparison of the speed-up achieved using MPI, OpenMP, and a hybrid of both MPI and OpenMP in parallel computing. The speed-up is quantified as a factor of improvement over the baseline.
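
As a minimal sketch of how this factor can be computed, the snippet below derives the speed-up from two runtimes and the parallel efficiency from a configuration label such as 8P8T. The function names are hypothetical, and the efficiency assumes that every thread of every process runs on its own core, which the text does not state. The only measured value used is the 24-fold speed-up reported above for 8P8T.

```python
import re


def parse_config(label: str) -> tuple[int, int]:
    """Split a label such as '8P8T' into (processes, threads)."""
    match = re.fullmatch(r"(\d+)P(\d+)T", label)
    if match is None:
        raise ValueError(f"unrecognized configuration label: {label!r}")
    return int(match.group(1)), int(match.group(2))


def speedup(baseline_time: float, parallel_time: float) -> float:
    """Speed-up S = T_baseline / T_parallel."""
    return baseline_time / parallel_time


def efficiency(speedup_value: float, label: str) -> float:
    """Parallel efficiency E = S / (P * T).

    Assumption: one core per thread, so P * T is the number of workers.
    """
    processes, threads = parse_config(label)
    return speedup_value / (processes * threads)


# The 24-fold speed-up reported for 8P8T, spread over 8 * 8 = 64 workers.
print(efficiency(24.0, "8P8T"))  # 0.375
```

Under that assumption, an efficiency below 1 indicates that part of the added computing capacity is spent on overheads such as communication and synchronization.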

  • Speed-up OpenMP This graph shows the speed-up results for parallel execution using only OpenMP, which allows for shared memory multiprocessing. There are incremental gains as the number of threads rises from 1P2T to 1P4T and then to 1P8T. The curve’s progression indicates that the application benefits from additional threads within the same process, taking advantage of shared memory for faster data access and reduced runtime. However, the curve begins to flatten toward 8 threads, hinting at a potential ceiling beyond which adding more threads may yield smaller gains, likely due to overheads such as thread contention and synchronization.

  • Speed-up MPI This graph showcases the speed-up achieved as the number of MPI processes increases. Starting from the baseline, the speed-up shows a consistent upward trajectory as the number of processes doubles from 2P1T to 4P1T, and then to 8P1T. This progressive improvement reflects the efficiency of parallelizing the task across multiple processors using MPI. The graph’s upward trend suggests that the application can effectively utilize additional distributed resources to reduce computation time.

  • Speed-up OpenMP + MPI A dramatic increase in speed-up is observed when employing a hybrid parallel computing approach that combines OpenMP and MPI. The speed-up factor rises significantly with each increase in the number of combined processes and threads, moving from 1P8T to 2P8T, then to 4P8T, and peaking at 8P8T. This graph illustrates the synergy that can be achieved when shared and distributed memory models are used in concert, leading to a speed-up that surpasses that of the individual parallelization strategies. The steep incline indicates efficient scaling that leverages multicore processors while keeping inter-process communication under control.
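
One common way to reason about the flattening noted for the OpenMP curve is Amdahl's law, which bounds the speed-up when a fixed fraction of the work remains serial. The sketch below is only an illustration of that general law: the serial fraction is a hypothetical value chosen for the example, not a quantity measured on SAM, and the function name is likewise an assumption.

```python
def amdahl_speedup(serial_fraction: float, workers: int) -> float:
    """Upper bound on speed-up when `serial_fraction` of the work cannot be parallelized."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must lie in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


# Hypothetical serial fraction, for illustration only (not measured on SAM).
SERIAL_FRACTION = 0.05

for workers in (1, 2, 4, 8, 64):
    bound = amdahl_speedup(SERIAL_FRACTION, workers)
    print(f"{workers:>2} workers: speed-up bound {bound:.2f}")
```

With any non-zero serial fraction s, the bound approaches 1/s as the number of workers grows, so each doubling of threads contributes less than the previous one.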

5 Conclusion

In this study, we have configured a scientific workflow that executes the WRF, WW3, and SAM numerical models on HPC resources to manage and operate a coastal flood warning system. The development and implementation of this next-generation toolset for facing intense weather events are crucial, as they allow decision-makers to take timely preventive measures before the arrival of floodwaters and to mitigate the effects of disasters on human lives, property, and the environment. In this application context, the timeliness of the model is important for management and emergency response activities, making the acceleration of computing times a crucial issue [49].

The SAM core is capable of handling the exponential increase in input data. This capability, combined with the model’s flexibility, makes it a decision support tool that can be adapted to any coastal area and can promptly provide the results of computations. SAM is a cloud-native, hierarchical, heterogeneous HPC-enabled marine ingression/flooding forecasting model. The proposed parallelization model is based on different parallel sub-schemes, which can be combined with one another to provide a hierarchical parallelization scheme. As demonstrated above, the problem decomposition makes the overall computing performance remarkable as the problem size increases.
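
To make the idea of a hierarchical scheme concrete, the following sketch splits a set of grid rows first among processes (the distributed memory level) and then among the threads of each process (the shared memory level). It is an assumption-laden illustration rather than SAM's actual decomposition: the row-wise split, the function names, and the grid size are all hypothetical.

```python
def split_range(start: int, stop: int, parts: int) -> list[range]:
    """Split [start, stop) into `parts` contiguous chunks whose sizes differ by at most one."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    base, extra = divmod(stop - start, parts)
    chunks = []
    lo = start
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # the first `extra` chunks get one more row
        chunks.append(range(lo, lo + size))
        lo += size
    return chunks


def hierarchical_decomposition(
    n_rows: int, processes: int, threads: int
) -> dict[tuple[int, int], range]:
    """Assign rows to (process, thread) pairs in two nested levels."""
    plan = {}
    for p, block in enumerate(split_range(0, n_rows, processes)):
        # Each process block is subdivided again among that process's threads.
        for t, rows in enumerate(split_range(block.start, block.stop, threads)):
            plan[(p, t)] = rows
    return plan


# Hypothetical grid of 1000 rows on a 2P4T configuration.
plan = hierarchical_decomposition(n_rows=1000, processes=2, threads=4)
for (p, t), rows in plan.items():
    print(f"process {p}, thread {t}: rows {rows.start}..{rows.stop - 1}")
```

Because the two levels are independent, the same plan covers the pure OpenMP case (one process), the pure MPI case (one thread per process), and any hybrid combination.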

In response to the challenge of accurately simulating wave dynamics in coastal shallow waters, a development initiative is underway to increase the resolution of the WW3 model’s \(d03_\textrm{WW3}\) domain by incorporating high-resolution bathymetry data [50, 51]. Specifically, we plan to integrate EMODNET 2020 bathymetry data at a resolution of 1/16 arc-minute (approximately 115 m) into the WW3 model. This is expected to improve the model’s ability to simulate wave processes in coastal shallow waters.
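
The quoted spacing of roughly 115 m can be checked with a short conversion, sketched below under the usual approximation that one arc-minute of latitude spans one nautical mile (1852 m). The constant and function names are hypothetical, and the conversion ignores the shrinking of east-west spacing with latitude.

```python
# One nautical mile in meters, taken as the length of one arc-minute of latitude.
METERS_PER_ARCMINUTE_LAT = 1852.0


def grid_spacing_m(arcminutes: float) -> float:
    """North-south grid spacing in meters for a spacing given in arc-minutes."""
    return arcminutes * METERS_PER_ARCMINUTE_LAT


print(grid_spacing_m(1 / 16))   # 115.75  (1/16 of an arc-minute)
print(grid_spacing_m(60 / 16))  # 6945.0  (1/16 of a degree, for comparison)
```

The first value matches the spacing of about 115 m, whereas a spacing of 1/16 of a degree would be close to 7 km.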

Alongside the higher resolution for the WW3 domains, we plan to use locally improved bathymetry for SAM to perform an accurate scalability evaluation that better frames the behavior of the proposed solution as a critical tool for a fast, effective, and precise response to intense weather events.

According to the most recent research, merging classical computational environmental science tools with AI prediction models and leveraging crowdsourced data will significantly reduce the social and economic costs of managing responses to intense weather events, saving human lives and production/business assets [52].

At the current design and development stage, SAM can be considered an operational prototype that merits further improvement and investigation. As short-term future research, the SAM computational kernel can be extended to support the computational malleability provided by Flex-MPI [53] in order to take advantage of application-dedicated HPC clusters or HPC environments for on-demand computation [54]. A more ambitious goal is coupling SAM with a deep learning model to predict the potential damage, in terms of costs, of a forecast intense weather/marine event [55].