1 Introduction

Probabilistic tsunami hazard assessment (PTHA) is an important concept in dealing with tsunami hazards and must consider factors such as inundation depth and velocity, which have large uncertainties. PTHAs have made remarkable progress in the past few decades, and particularly in recent years, an increasing number of study cases of probabilistic tsunami inundation assessment have been conducted on land (e.g., Gonz´alez et al. 2009; Mueller et al. 2014; Lorito et al. 2015; Goda and Song 2016; Park and Cox 2016; Fukutani et al. 2021; Davies et al. 2022). Probabilistic tsunami inundation assessment on land requires many inundation simulations that consider various uncertainties; thus, the computational cost is much higher than that for offshore evaluations. In recent years, many methods have been proposed to reduce the cost of tsunami simulations. Lorito et al. (2015) and Volpe et al. (2019) applied a two-stage filtering procedure to select a reduced set of sources and calculate nonlinear probabilistic inundation maps. They proposed a unique and effective method to evaluate the probabilistic tsunami inundation depth. Rohmer et al. (2018) proposed a Bayesian procedure to infer the probability distribution of the source parameters of an earthquake to overcome the high computation time of the numerical simulator. Sepúlveda et al. (2017) used the stochastic reduced-order model (SROM), which enables the selection and weighting of a subset of random scenarios. They suggested that the proposed method provided a more efficient representation of tsunami variability than that derived from Monte Carlo sampling. Williamson et al. (2020) proposed sampling techniques to adequately sample the tails of the distribution and properly reweight the probability assigned to the resulting realizations and by grouping the realizations into a small number of clusters that will give similar inundation patterns in the region of interest.

In recent years, there has also been active research on methods to reduce computational costs for probabilistic tsunami hazard assessment by constructing an emulator that uses a surrogate model to approximate a deterministic response by a physical model (e.g., Sarri et al. 2012; Sraj et al. 2014; Behrens and Dias 2015; de Baar and Roberts 2017; Guillas et al. 2018; Denamiel et al. 2019; Gopinathan et al. 2020, 2021; Snelling et al. 2020; Ehara and Guillas 2021; Fukutani et al. 2021; Giles et al. 2021; Salmanidou et al. 2021; Tozato et al. 2022). de Baar and Roberts (2017) used multifidelity sparse grid interpolation to propagate the uncertainty in the shape of the incoming wave. They proposed a method to combine results from a small number of accurate high-fidelity simulations with a large number of less accurate low-fidelity simulations. Guillas et al. (2018) developed a functional emulator that efficiently and parsimoniously approximates tsunami wave forms, making use of functional principal component analysis (FPCA). Giles et al. (2021) approximated functionally complex and computationally expensive high-resolution tsunami simulations with a simple and inexpensive statistical Gaussian process emulator (GPE). Salmanidou et al. (2021) used the efficient mutual information for computer experiments (MICE) by Beck and Guillas (2016) for construction of a GPE of the tsunami model. Gopinathan et al. (2020, 2021) also created a GPE of the tsunami model. To reduce the computational burden, they used a localized nonuniform unstructured mesh. Ehara and Guillas (2021) presented the multilevel adaptive sequential design of computer experiments (MLASCE) in the framework of GPE to model tsunami risk. They succeeded in efficiently allocating limited computational resources over simulations of different levels of fidelity. However, none of these approaches have been applied to wide two-dimensional tsunami runup simulations that target land inundation. Tozato et al. (2022) and Fukutani et al. (2021) achieved a significant reduction in the number of required tsunami simulations for probabilistic tsunami inundation assessment by extracting spatial correlations of inundation depth distribution with proper orthogonal decomposition (POD).

When considering the hazard assessment of tsunami inundation depths on land, the treatment of coastal seawalls is not straightforward because the inundation depth and its distribution due to runup tsunamis change dramatically depending on whether the seawalls are destroyed. If they are destroyed, the inundation situation will also change depending on the extent of seawall failure. In the 2011 Tohoku tsunami, the tremendous tsunami wave force collapsed and destroyed many of the seawalls along the Pacific coast of the Tohoku region (e.g., Kato et al. 2012; Mase et al. 2013; Yeh et al. 2013). In Japan, it is common to assume that seawalls are destroyed after a tsunami overtops them when the maximum class of tsunamis is assessed, as in the case of drawing hazard maps that are intended to provide information for the evacuation of residents. Thus, when conducting PTHA that considers various types of uncertainties for tsunami hazards, it is important to evaluate the uncertainty of seawall failure in the event of a large earthquake and tsunami by setting a failure probability for the seawalls. This perspective is very important when considering tsunami hazard assessment in areas where seawalls have been installed. However, there have been no studies in which PTHA has been conducted by incorporating the failure probability of seawalls.

This study proposes a method to improve the efficiency of probabilistic tsunami inundation assessment by decomposing the tsunami inundation distribution using POD, considering the randomness of the tsunami sources and the uncertainty of the seawall height in the event of a large earthquake and tsunami, which has a particularly substantial impact on the tsunami inundation depth distribution. Based on the methods of Tozato et al. (2022) and Fukutani et al. (2021), a new probabilistic evaluation method is proposed in the present study that additionally considers the uncertainty of random tsunami sources and seawall height.

The remainder of this manuscript is organized as follows: an overview of the proposed methodology and a flowchart of this study are presented in Sect. 2. Section 3.1 describes the random tsunami sources and tsunami numerical simulation process. POD for the tsunami inundation depth distribution is introduced in Sect. 3.2. Section 3.3 presents Monte Carlo sample generation using tsunami inundation modes. The number of tsunami sources to be evaluated in PTHA is considered in Sect. 3.4. Section 4.1 describes the seawall setting for tsunami numerical simulations. Tsunami inundation distributions using inundation modes considering random tsunami sources and the seawall failure probability are examined in Sect. 4.2. Section 4.3 presents the results and discussion. The probabilistic tsunami inundation assessment is described in Sects. 5.1 and 5.2. Finally, the study concludes in Sect. 6.

2 Flowchart of the probabilistic tsunami inundation assessment method

Figure 1 shows a flowchart of the probabilistic tsunami inundation assessment method proposed in this study. The numbers within the two lines in Fig. 1 represent the section numbers.

Fig. 1
figure 1

Flowchart of the probabilistic tsunami inundation assessment method. The numbers within the two lines represent the section numbers

Initially, \(N\) tsunami sources are used at each moment magnitude (hereafter Mw) in accordance with Goda et al. (2016), which are randomly placed slip distributions of megathrust earthquakes, and the corresponding tsunami inundation depth distribution is obtained for a target area by solving nonlinear shallow water equations. To examine whether the number of assumed tsunami sources can be reduced when considering these many distributions of tsunami inundation depths, singular value decomposition (SVD) derived from POD is applied to the inundation distribution obtained by reducing the number of tsunami sources; many inundation distributions are regenerated by applying random numbers to the right-singular vector \(\mathrm{W}\); then, the statistics of the generated inundation distribution are compared with those of the original distributions. The statistical preconditioning method is applied to generate the random right-singular vector.

Next, \(M\) tsunami sources are selected that can accurately reproduce the original inundation depth distribution, and then the inundation distribution that considers the uncertainty of seawall failure is evaluated. First, the inundation distributions with no artificial structures and the inundation distribution assuming multiple heights of seawalls are simulated by nonlinear shallow water equations. To obtain pseudo inundation depth distributions by mode decomposition and mode coupling of the inundation depth distributions as implemented above, a failure probability of seawalls is introduced, and the coefficients are estimated by Gaussian process regression (GPR). Then, a Brownian passage time (BPT) distribution with nonstationary occurrence probability is applied for the assumed megathrust earthquakes, which is used to determine whether an earthquake can occur every year; then, based on the Gutenberg-Richter law, the Mw of any earthquake that will occur is predicted. Monte Carlo simulations are performed to repeat this process, and finally, the evaluation results of the probability distribution of tsunami inundation depths greater than 1.0 m are presented for the next 50 years and for the next 10 years.

3 Investigating a reduction in the number of assumed tsunami sources using inundation distribution modes

This section examines, through the results of mode decomposition and coupling, whether the number of assumed tsunami sources can be reduced when considering the areal distribution of many inundation depths obtained from the assumed tsunami sources.

3.1 Random tsunami sources and tsunami numerical simulations

Initially, the present study employed the random source model by Goda et al. (2020) to obtain random slip distributions on the earthquake fault for the tsunami source region of the Nankai Trough megathrust earthquake published by the Central Disaster Management Council of the Japanese Cabinet Office (2013). The study by De Risi and Goda (2016) discussed a sufficient number of random earthquake faults based on estimated tsunami wave height as a function of the number of tsunami simulations and indicated that the 50th percentile curves are stable after 100 numerical simulations for all the considered magnitude values. The present study followed their study and generated 100 tsunami sources with random slip distributions at each Mw (9.0, 8.8, 8.6, 8.4, 8.2, and 8.0) for a total of 600 earthquake faults.

Figure 2 shows the domains of tsunami numerical simulations for the case analysis targeting a town in Japan. The town faces the Pacific Ocean side of the Japanese islands where the Nankai Trough megathrust earthquake is located and is thus considered a town with high tsunami risk. Figure 2c shows the target area for the probabilistic tsunami inundation assessment. A river flows through the centre of the town, and there are hills with elevations of 5 to 10 m from the coastal area in the north to the inland area and mountains with elevations exceeding 150 m in the south. Tsunamis from the coast mainly flow upstream from the rivers in the centre of the town. Table 1 shows the numerical simulation conditions. With the initial water level as input data evaluated using the theory of Okada (1985), tsunami numerical simulations were performed via the continuity equation and nonlinear shallow water equations with grid spacings of 810 m–270 m–90 m–30 m–10 m. The time interval for 810 m is set to 1.2 s, and then the interval is divided into thirds as the mesh becomes finer. The spatial grids and the time interval were set up to satisfy the Courant–Friedrichs–Lewy (CFL) condition. The tide level in the coastal area of the town was set at + T.P. 0.92 m, which is the mean monthly highest water level around the area. T.P. is the abbreviation for Tokyo Peil, which is the mean sea level in Tokyo Bay. The topographic and Manning coefficient data published by the Central Disaster Management Council of the Japanese Cabinet Office (2013) were used in numerical simulations, and tsunami numerical simulations were performed for the first 4 h after earthquake generation so that the maximum tsunami inundation depth could be properly evaluated while considering the effects of tsunami reflection and amplification. With random tsunami sources and tsunami numerical simulations, 100 inundation depth distributions at each earthquake Mw were obtained for the target area.

Fig. 2
figure 2

Five tsunami simulation domains for a town in Japan facing the Pacific Ocean. (a) Domain 1 (810 m mesh)–Domain 2 (270 m mesh)–Domain 3 (90 m mesh), (b) Domain 3 (90 m mesh)–Domain 4 (30 m mesh)–Domain 5 (10 m mesh). (c) Domain 5 (10 m mesh). The black and colour shading in the maps indicates topography (m)

Table 1 Calculation conditions for the tsunami numerical simulation

3.2 Tsunami inundation modes

The inundation depth matrix at each mesh point \(i\) calculated by the nonlinear shallow water equations was defined as \({\varvec{x}}={{x}_{i}}^{T}(i=1,\dots ,I)\). In this study, the finest area for which the tsunami inundation depth was simulated is a 10 m mesh, which is 200 \(\times\) 250 in size; thus, \(I\) = 50,000. The following inundation data matrix \(\mathbf{X}\) is generated considering the randomness of the slip distribution of tsunami sources for each analysis case \((j=1,\dots ,J)\). The numbers of random tsunami sources for each Mw are 100; thus, \(J\) = 100:

$${\mathbf{X}} = x_{ij} = \left( {\begin{array}{*{20}c} {x_{11} } & \ldots & {x_{1J} } \\ \ldots & \ldots & \ldots \\ {x_{I1} } & \ldots & {x_{IJ} } \\ \end{array} } \right)$$
(1)

where, \(\mathbf{X}\) is an \(I\times J\) matrix and is often a nonsquare matrix. Before implementing mode decomposition, the data matrix \(\mathbf{X}\) is standardized into a matrix \({\mathbf{X}}_{0}\) using the following equation.

$${\mathbf{X}}_{0} = \frac{{{\mathbf{X}} - \mu_{G} {\mathbf{H}}}}{{\sigma_{G} }}$$
(2)

where \(\mathbf{H}\) is an \(I\times J\) matrix whose elements are all one, and \({\mu }_{G}\) and \({\sigma }_{G}\) are the mean and standard deviation of the \(J\) case, respectively. Then, \({\mathbf{X}}_{0}\) is decomposed as follows:

$$X_{0} = U\Sigma V^{T} = \left( {\begin{array}{*{20}c} | & \ldots & | \\ {u_{1} } & \ldots & {u_{J} } \\ | & \ldots & | \\ \end{array} } \right)\left( {\begin{array}{*{20}c} {\lambda_{1} } & \ldots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \ldots & {\lambda_{J} } \\ \end{array} } \right)\left( {\begin{array}{*{20}c} - & {v_{1} } & - \\ \vdots & \vdots & \vdots \\ - & {v_{J} } & - \\ \end{array} } \right)^{T}$$
(3)

where, \(\mathbf{U}\) is an \(I\times J\) orthonormal matrix containing the left-singular vectors \({{\varvec{u}}}_{j}\), \({\varvec{\Sigma}}\) is an \(J\times J\) pseudodiagonal and semipositive definite matrix with diagonal entries containing the singular value \({{\varvec{\lambda}}}_{j}\), and \(\mathbf{V}\) is an \(J\times J\) ortmal matrix containing the right-singular vector \({{\varvec{v}}}_{j}\). Figure 3 shows the spatial distribution of the left-singular vectors \({{\varvec{u}}}_{j}\) (j = 1, …, 6) for the Mw 9.0 case only, while 100 mode distributions corresponding to the numbers of tsunami inundation simulations can be identified. The distribution of the first mode (Mode 1) shows that coastal areas and lowland areas around rivers have positive values, while other areas have negative values. Thus, in Mode 1, the inundation depth is positively correlated between the coastal areas and the lowland areas around rivers, as well as between the other areas. The distribution of the second mode (Mode 2) shows that the values are negative in coastal areas and positive in the other areas; thus, there is an inverse correlation between the coastal areas and other areas. The spatial distributions from the third mode (Mode 3) onwards are complex, with sparsely distributed positive and negative values.

Fig. 3
figure 3

Spatial distributions of the left-singular vectors \({{\varvec{u}}}_{j}\) (\(j\) = 1, …, 6) for only the Mw 9.0 case corresponding to each mode. Positive values are shown in red, and negative values are shown in blue, which means positive correlations of the inundation depth between meshes of the same sign and a negative correlation of the inundation depth between meshes of different signs

3.3 Monte Carlo sample generation using tsunami inundation modes

The left-singular vectors \(\mathbf{U}\) have information on the spatial correlation of each inundation mode. Retaining the left-singular vector \(\mathbf{U}\) and singular vectors \({\varvec{\Sigma}}\) and replacing the right-singular vector \(\mathbf{V}\) with another \(J\)-dimensional vector allows a new sample of tsunami inundation distributions to be generated while keeping information on the spatial correlation of the original inundation depths. Defining the number of trials as \(N\) and the matrix \({\mathbf{Y}}_{0}\) obtained by replacing the \(J\) × \(J\) matrix \(\mathbf{V}\) in Eq. 3 with the \(N\times J\) matrix \(\mathbf{W}\) yields

$${\mathbf{Y}}_{0} = {\mathbf{U\Sigma W}}^{{\mathbf{T}}}$$
(4)
$${\mathbf{W}} = \left( {\begin{array}{*{20}c} {{\varvec{w}}_{1} } \\ \vdots \\ {{\varvec{w}}_{N} } \\ \end{array} } \right) = \left( {\begin{array}{*{20}c} {w_{11} } & \cdots & {w_{1J} } \\ \vdots & \ddots & \vdots \\ {w_{N1} } & \cdots & {w_{NJ} } \\ \end{array} } \right)$$
(5)

Here, the condition that the matrix \(\mathbf{W}\) must satisfy in relation to the covariance matrix is given in the following equation (Nojima and Kuse 2018):

$$\frac{{{\mathbf{W}}^{{\text{T}}} {\mathbf{W}}}}{N} = \frac{{\mathbf{E}}}{J}$$
(6)

\(\mathbf{E}\) is a \(J\) × \(J\) identity matrix. The matrix \(\mathbf{W}\) satisfying Eq. 6 can be generated by statistical preconditioning to obtain many samples of the inundation distribution. Statistical preconditioning is a method of obtaining a given covariance matrix with a small number of samples by systematic sampling using the periodicity and orthogonality of the cosine function. Specifically, the matrix \(\mathbf{W}\) is generated using the following equation (Yamazaki and Shinozuka 1990):

$${\mathbf{W}} = w_{{ij}} = \sqrt {\frac{2}{{N_{f} }}} \sum\nolimits_{{m = 1}}^{{N_{f} }} \begin{gathered} {\text{cos}}\;(\omega _{k} i + \psi _{k} )/\sqrt J {\text{~}}\; \hfill \\ \left( {i = 1,\; \ldots N,\;{\text{~}}j = 1,\; \ldots ,\;J} \right) \hfill \\ \end{gathered}$$
(7)
$$N = 4JN_{f}$$
(8)
$$\omega_{k} = \frac{2\pi k}{N} \left( {k = \left( {m - 1} \right)J + j} \right)$$
(9)

where, \({N}_{f}\) = the number of cosines to be added, \({\psi }_{k}\) = a random phase angle uniformly distributed between 0 and 2π, and \({\omega }_{k}\) = the kth circular frequency. Finally, the pseudo tsunami inundation distribution \(\mathbf{Y}\) is obtained using the mean value \({\mu }_{G}\) and standard deviation \({\sigma }_{G}\) used in the standardization:

$${\mathbf{Y}} = \mu_{G} {\mathbf{H^{\prime}}} + \sigma_{G} {\mathbf{Y}}_{0}$$
(10)

where, \(\mathbf{H}\mathbf{^{\prime}}\) is an \(I\times N\) matrix whose elements are all one.

Applying the above procedure to the tsunami inundation distributions obtained from 50, 20, and 10 tsunami sources and comparing the statistics of the generated inundation distribution samples with those obtained from the original 100 tsunami sources provides the information needed to reduce the number of tsunami sources to be considered in the target area. Figure 4 shows the mean tsunami inundation depth distribution of the original and Monte Carlo samples for earthquake magnitudes Mw 9.0 and Mw 8.4, assuming \({N}_{f}=16\) in the statistical preconditioning. The original means (Fig. 4(a) and (a’)) are the averages of 100 analyses, and the sample means (Fig. 4(b, c, d) and 4(b’, c’, d’)) are the averages of 6,400 analysed values from Eq. 8. In the Mw 9.0 results, the results with 50 tsunami sources (N = 50) are slightly overestimated overall, and the results with 10 tsunami sources (N = 10) are slightly underestimated overall. Figure 5 shows the results of the two-sample t test (p values) based on the null hypothesis that the frequency distribution of the original inundation depth and that of the inundation depth generated via POD are equal in each mesh. It can be observed that the p values are small at many mesh points for N = 50 and N = 10, and the null hypothesis can thus be rejected. However, the p values are large overall, and the null hypothesis cannot be rejected for N = 20, indicating that the frequency distribution of the generated inundation depth for N = 20 is similar to that of the original inundation depth. In contrast, the results for Mw 8.4 are not significantly different among the 50, 20, and 10 tsunami sources, and all of them obtain an inundation distribution that is close to the original mean.

Fig. 4
figure 4

Comparison of tsunami inundation depth distributions between the original means and the Monte Carlo sample means. The upper panel shows the results for Mw 9.0, and the bottom panel shows the results for Mw 8.4. (a) and (a’) indicate the original means derived from the tsunami simulation results using 100 random tsunami sources, (b) and (b’) the Monte Carlo sample means using 50 tsunami sources (N = 50), (c) and (c’) using 20 tsunami sources (N = 20), and (d) and (d’) using 10 tsunami sources (N = 10)

Fig. 5
figure 5

Results of the two-sample t test (ip values) for (a) N = 50, (b) N = 20 and (c) N = 10

Figure 6 shows the frequency distribution of inundation depths and the mean, standard deviation, and maximum values at coastal point A (Matsubara Beach). Figure 7 shows inland low-lying point B. The locations of points A and B are shown in Fig. 2(c). The objective here is to construct a model that can reproduce the inundation depth distribution in the target area and the shape of the original probability distribution of the inundation depth at each point, rather than to precisely assess representative values such as the mean, standard deviation and maximum value at each point. However, the inundation depth distributions could be reproduced relatively well in all cases, representative values such as the mean, standard deviation and maximum value were identified for only two points, namely, coastal point A and inland low-lying point B, and cases that yielded values close to these representative values were adopted for analysis. It should be noted that the representative values at each point may be subject to errors that occur with the acquisition of Monte Carlo samples. Future research should model these errors and correct the POD results. At coastal point A, the results for N = 50, 20, and 10 all reproduce the mean, standard deviation, and maximum values of the original well. However, the original shows that the maximum value decreases as Mw also decreases when considering the maximum value, but this trend is captured only in the results for N = 20. The same is true for point B. The basic trend is well reproduced. These results illustrate that the original results could be reproduced well when pseudo inundation distributions are generated, even with the number of tsunami sources N = 20.

Fig. 6
figure 6

Comparison of tsunami inundation depth at coastal point A in Fig. 2(c) between the originals and the Monte Carlo samples. The upper histograms show the tsunami inundation depth frequencies derived from Mw 9.0 tsunami sources for (a) the originals (N = 100), (b) samples (N = 50), (c) samples (N = 20), and (d) samples (N = 10), and the bottom histograms show the tsunami inundation depth frequencies derived from Mw 8.4 tsunami sources for (a’) the originals (N = 100), (b’) samples (N = 50), (c’) samples (N = 20), and (d’) samples (N = 10). The bar charts below show (e) means, (f) standard deviation and (g) maximum tsunami inundation depth for each earthquake magnitude

Fig. 7
figure 7

Comparison of tsunami inundation depth at point B in Fig. 2(c) between the originals and the Monte Carlo samples. The objects shown in the histograms and the bar charts are the same as those in Fig. 5

3.4 Target tsunami sources for PTHA

The previous section confirmed that the original results can be reproduced even if the number of tsunami sources is 20, which was equivalent to 1/5 of the original number of tsunami sources. The results of the analysis for the 20 fixed predetermined tsunami sources were shown, but the results are expected to vary depending on which 20 sources are selected. The effects of randomly selecting these 20 sources are discussed below.

Although the number of all combinations of 20 sources selected from 100 sources is 100C20, only 1,000 trials of selecting 20 sources at random are performed, and the results are examined. Figure 8 shows the mean, standard deviation, and maximum results at points A (Fig. 8(a, b)) and B (Fig. 8(c, d)) for Mw 9.0 and Mw 8.4. The dots are the results of 1,000 trials, and the solid line is the original value. All 1,000 trials are found to vary around the original value, but the variation in the mean and standard deviation is relatively small, while the maximum value varies relatively widely. The results for Mw 8.4, the maximum value at point B, show that the original is close to the upper limit of variation for the trials, suggesting that the simulation results of this model are likely to be evaluated lower than the original inundation depth at relatively low inundation depths. The results for Mw 8.2 and Mw 8.0 at point B showed a similar trend.

Fig. 8
figure 8

Trial values for the mean, standard deviation, minimum and maximum for (a) Mw 9.0 and (b) Mw 8.4 at point A. Similarly, the results for (c) Mw 9.0 and (d) Mw 8.4 at point B are shown

The same analysis was performed for all Mw cases, and 20 sources that could generate values close to the mean and standard deviation of the original data for each Mw were selected. In other words, the computational cost was reduced to 20% of that in the case of the original 100 tsunami sources. This was achieved because the spatial correlation of the inundation depth distribution in the target area was successfully extracted via POD and SVD. If a single tsunami source (an earthquake) were assumed and nonlinear long wave theory were used to calculate 4 h of land runup, it would take more than one day on a single CPU with the current typical approach. Therefore, a calculation involving 600 cases (100 cases/Mw × 6 Mw), as in this study, would require 600 days. Naturally, the computation time could be reduced if multiple CPUs were used to perform parallel computations. POD and SVD are methods that apply Eqs. 3 and 4, respectively, to the tsunami numerical simulation results, which could be performed instantaneously, as long as calculation results are available. In this case, tsunami numerical simulations must still be performed, but if the number of original tsunami sources assumed could be reduced to 20% of the original assumption, the computational cost of the numerical simulations could be correspondingly reduced to 20% of the original cost.

4 Probabilistic tsunami inundation assessment considering random slip and seawall failure

A probabilistic tsunami inundation assessment using the 20 tsunami sources selected was performed in Sect. 3, considering the randomness of the slip distribution of the tsunami sources as well as the uncertainty of the seawall failure, which is very important for the assessment of tsunami inundation depths on land.

4.1 Seawall setting for tsunami numerical simulations

The existence of 10 m high seawalls on the coast of the target area is assumed, and the uncertainty of seawall failure is considered. The previous section described tsunami numerical simulations performed using simple ground elevation data without considering any artificial structures, such as seawalls. This section additionally simulates the inundation distribution on land using nonlinear shallow water equations, assuming 10 m and 5 m high seawalls. That is, three cases of tsunami numerical simulations for a tsunami source are investigated: a 10 m seawall, a 5 m seawall and no seawall. Seawalls are simply set by modifying the elevation data of the coastal area. Figure 9 shows the tsunami numerical simulation results with and without setting seawalls. The inundation area behind the seawalls slightly decreases, and the inundation depth clearly decreases as the seawalls are installed and their height increases. The results for the 10 m seawall in Fig. 9(c) show that the maximum inundation depth along the river as well as behind the seawall is reduced since the tsunami flow into the river is reduced, which clearly indicates that the influence of the presence of seawalls cannot be ignored in assessing the inundation area of tsunamis. It should be noted, however, that the effect of seawalls is significant for tsunamis of smaller magnitude but limited for tsunamis of larger magnitude. This means that tsunamis of smaller magnitude are more likely to be obstructed by coastal seawalls, while tsunamis of larger magnitude will overtop them, making their presence of little relevance.

Fig. 9
figure 9

Changes in tsunami inundation depth distribution due to the difference in seawall installation. The red line indicates the location of the seawalls. (a) No seawalls, (b) 5 m tall seawalls and (c) 10 m tall seawalls

There are 20 cases of random tsunami sources for each Mw and 3 cases of seawall height; thus, it is necessary to calculate 60 cases for each Mw. As this needs to be calculated for all Mw values, it is necessary to calculate a total of 360 cases. If the number of assumed tsunami sources in each Mw is 100 cases, then a total of 1,800 cases must be calculated. Thus, the computational cost can be greatly reduced by reducing the number of tsunami sources by a factor of five.

4.2 Generating tsunami inundation distributions considering random slip and seawall failure

The inundation depth distribution is assessed considering the randomness of tsunami sources and the uncertainty of seawall failure. From Eq. 3, the column vector \({{\varvec{x}}}_{j}\) of inundation depths for a case \(j\) can be transformed as follows (e.g., Fukutani et al. 2021; Tozato et al. 2022):

$${\varvec{x}}_{j} = \sum\nolimits_{k = 1}^{N} {{\varvec{u}}_{k} \left( {\lambda_{k} v_{kj}^{T} } \right)} = \sum\nolimits_{k = 1}^{N} {\left( {\lambda_{k} v_{jk} } \right){\varvec{u}}_{k} } = \sum\nolimits_{k = 1}^{N} {\left( {\alpha_{jk} } \right){\varvec{u}}_{k} }$$
(11)

where \(N\) is the number of all modes and \({\alpha }_{jk}\) is expressed as the coefficient of the \(j\)th case for mode \(k\) multiplied by the singular value and the right-singular vector. Equation 11 demonstrates that the column vector \({{\varvec{x}}}_{j}\) of the inundation depth can be represented as a linear sum of the coefficients in each mode multiplied by the left-singular vector. \({{\varvec{u}}}_{k}\) is retained as computed by the SVD in Eq. 3 using the 60 numerical simulation results at each Mw. The coefficients \({\alpha }_{jk}\) are estimated for the parameters (random slip and seawall height) using the Bayesian estimation method with GPR. Using Bayesian inference, the predictive posterior distribution \({\mathbf{f}}_{\boldsymbol{*}}|\mathbf{f}\) that follows the prediction \({\mathbf{f}}_{\mathbf{*}}\) given the training function \(\mathbf{f}\) is

$$p({\mathbf{f}}_{\user2{*}} |{\mathbf{f}},\;{\varvec{X}}_{\user2{*}} ,\;\user2{ X}) = {\mathcal{N}}({\mathbf{f}}_{\user2{*}} |{\varvec{m}}_{\user2{*}} ,\;{\varvec{V}}_{\user2{*}} )$$
(12)
$${\varvec{m}}_{\user2{*}} = K\left( {{\varvec{x}},\;{\varvec{x}}_{\user2{*}} } \right)^{{\varvec{T}}} K\left( {{\varvec{x}},\;{\varvec{x}}} \right)^{ - 1} {\mathbf{f}}$$
(13)
$${\varvec{V}}_{\user2{*}} = K\left( {{\varvec{x}}_{\user2{*}} ,\;{\varvec{x}}_{\user2{*}} } \right) - K\left( {{\varvec{x}},\;{\varvec{x}}_{\user2{*}} } \right)^{{\varvec{T}}} K\left( {{\varvec{x}},\;{\varvec{x}}} \right)^{ - 1} K\left( {{\varvec{x}},\;{\varvec{x}}_{\user2{*}} } \right)$$
(14)

where, \({{\varvec{m}}}_{\boldsymbol{*}}\) is the expected value and \({{\varvec{V}}}_{\boldsymbol{*}}\) is the covariance function of the posterior distribution. \(K({\varvec{x}},{\varvec{x}})\) is a matrix of the Gaussian kernel. The distribution values \({\mathbf{f}}_{\boldsymbol{*}}\) can be generated from the joint posterior distribution by evaluating the mean and covariance function from Eqs. 12, 13, 14 and generating the samples accordingly. The details of the derivation of GPR have been provided in many past studies (e.g., Kuss 2006; Rasmussen and Williams 2006).

The coefficients \({\alpha }_{jk}\) corresponding to each mode were determined by using random numbers following the probability distribution of the input parameters (random slip and seawall height). First, the uncertainty of seawall failure is assessed by assigning a probability density to the height of the seawalls, meaning that the probability density that is set is the uncertainty of the seawall height, which encompasses not only the direct destruction of the seawall due to wave forces from the tsunami but also the physical destruction or collapse of the seawalls due to the strong seismic motion occurring before the tsunami as well as the weakening of the seawalls due to land subsidence and a reduction in the height of the seawalls. Specifically, the probability density for each 1.0 m of seawall height from 1.0 to 10 m is established, as shown in Fig. 10. A discrete probability density is established for the seawall height in this study, but a continuous probability density can also be established. The setting for a 10 m seawall height means that the original seawall will not be destroyed and will remain intact in the event of a megathrust earthquake and tsunami, while the setting of a 0 m seawall height suggests that the original seawall is completely destroyed and collapses. In this study, two different probability densities of seawall height are established to identify differences in the evaluation results in cases with different probability density settings for seawall height. Figure 10(a) shows an equal probability density setting for the height of the seawalls, while Fig. 10(b) shows a probability density weighted at the original 10 m height. Compared to the former, the latter setting involves a seawall that is less likely to be destroyed and is more resilient. There are no clear criteria for the probability density that should be established for any type of seawall. Thus, this is an issue for future research.

Fig. 10
figure 10

Probability density function for evaluating the uncertainty of seawall height in the event of a megathrust earthquake and tsunami generation. (a) An equal probability density for the seawall height. (b) A probability density weighted at the original 10 m

Random numbers that follow these probability densities were generated by using the inverse transform method to determine the height of the seawall in each Monte Carlo calculation. Together with random numbers corresponding to random tsunami sources, coefficients \({\alpha }_{jk}\) were generated based on the GPR results. Once \({\alpha }_{jk}\) is determined together with \({{\varvec{u}}}_{k}\), pseudo tsunami inundation depth distributions can be randomly generated using the surrogate model represented by Eq. 11.

4.3 Results and discussion

Figure 11 shows the estimation results for the coefficients \({\alpha }_{jk}\) corresponding to Modes 1 to 3. The parameters for \({\alpha }_{jk}\), the random tsunami sources and seawall height were normalized to a minimum value of 0.0 and a maximum value of 1.0. That is, for the random tsunami sources, the case with the largest area inundated by the tsunami source for the target area was set to 1.0, and the case with the smallest area was set to 0.0. For the seawall height, the 10 m height of seawalls was set to 1.0, and the 1.0 m height of seawalls was set to 0.0. The red dots show the coefficients derived from the 60 numerical tsunami simulations and SVD, and the coloured surfaces show the regression results using GPR. The parameters used in GPR were set as kernel length parameter \(1/\sqrt{2}\), kernel vertical variation parameter 1.0, and noise parameter 0.0001. Noise was set to almost zero. The coloured surfaces are a reasonably good representation of the data obtained from the tsunami numerical calculations and SVD.

Fig. 11
figure 11

The coefficients \({\alpha }_{jk}\) corresponding to (a) Mode 1 to (c) Mode 3. The horizontal axis indicates the standardized parameters for the random tsunami sources and the seawall height. The red dots denote the coefficients \({\alpha }_{jk}\) derived from the tsunami numerical simulations. The colour surfaces indicate the results of Gaussian process regression (GPR)

Generating random numbers in the range of 0.0 to 1.0 for the two parameters allows the value of \({\alpha }_{jk}\) to be estimated based on the regression results from the GPR, and this value can be substituted it into Eq. 11, allowing any tsunami inundation depth distribution other than the numerically obtained tsunami inundation distribution to be estimated. Figure 12 shows the Monte Carlo samples of only six generated inundation depth distributions. The figures show that all of them can represent realistic inundation depth distributions.

Fig. 12
figure 12

The Monte Carlo samples of only six generated inundation depth distributions using Eq. 11 for the target area

5 Probabilistic tsunami inundation assessment

Finally, a probabilistic tsunami inundation assessment is performed using many inundation depth distributions generated by considering the randomness of tsunami sources and the uncertainty of seawall failure, which was assessed in the previous section.

5.1 Time-dependent occurrence probability model

It is necessary to consider the generation probability of the target earthquake to evaluate probabilistic tsunami inundation depth distributions due to earthquake-induced tsunamis. The target of this study, the Nankai Trough megathrust earthquake, has been considered a characteristic earthquake, and its interval of occurrence has been specified by the Earthquake Research Committee (2013). Therefore, the following probabilistic time-dependent generation model, the BPT distribution, is used as the probability of occurrence of the Nankai Trough megathrust earthquake:

$$P\left( t \right) = \sqrt {\frac{\mu }{{2\pi \alpha^{2} t^{3} }}} {\text{exp}}\left( { - \frac{{\left( {t - \mu } \right)^{2} }}{{2\mu \alpha^{2} t}}} \right)$$
(15)

where, \(t\) is the elapsed time since the last earthquake; \(\mu\) and \(\alpha\) are the first- and second-order parameters of the distribution, respectively; and \(\mu\) is defined as the mean interval between active years for the earthquake.

The BPT distribution for the Nankai Trough megathrust earthquake is applied with a mean value \(\mu\) of 88.2 years and an α value of 0.24 for the variability (Earthquake Research Committee 2013), and random numbers are generated to determine whether the earthquake can occur every year. Then, the Mw of any earthquake predicted to occur is determined based on the Gutenberg-Richter law (Gutenberg and Richter 1944). Monte Carlo simulations are performed to repeat this process, and the results of the probability distribution of tsunami inundation depths greater than 1.0 m are presented for the next 50 years and for the next 10 years.

5.2 Results and discussion

Figure 13 shows the results of the evaluation for the probability of the inundation depth distribution being greater than 1.0 m in the next 50 years (Fig. 13(a) and (b)) and the next 10 years (Fig. 13(c) and (d)), setting the probability density of the seawall height shown in Fig. 10. The figure on the left is the equal probability density setting shown in Fig. 10(a), and the figure on the right is the probability density setting weighted at the original 10 m shown in Fig. 10(b). Not surprisingly, the results of the assessment covering the next 50 years show a wider overall inundation distribution and a higher probability of tsunami inundation than the results of the next 10 years. The results for the next 50 years show that there is a nearly 100% probability that the entire coastal area will be at least 1.0 m, while the results for the next 10 years show a probability of 40% at best. The mean value of the BPT distribution for the Nankai Trough earthquakes is 88.2 years, and as of 2022, 76 years have passed since the latest earthquake, the 1946 Showa Nankai earthquake (e.g., Tanioka and Satake 2001); thus, the next Nankai Trough earthquake is imminent from a probabilistic point of view. Therefore, the probability of inundation is assessed to be several tens of percent over a wide area in the target area even if only for a short period of time, such as the next 10 years. Additionally, the figure clearly shows that the inundated area due to tsunamis is reduced when we set the seawalls with the resilience setting shown in Fig. 10(b). In lowland areas along the river and behind seawalls, the inundation probability decreased by approximately 20–30%. Therefore, these findings show that the tsunami inundation depth distribution can be probabilistically assessed considering the randomness of the tsunami sources and the uncertainty of seawall failure by using the proposed method.

Fig. 13
figure 13

Tsunami inundation probability distribution for inundation to a depth of 1.0 m or more (a) in the next 50 years with the failure degree probability of the seawall in Fig. 10(a), (b) in the next 50 years with the failure degree probability of the seawall in Fig. 10(b), (c) in the next 10 years with the failure degree probability of the seawall in Fig. 10(a), and (d) in the next 10 years with the failure degree probability of the seawall in Fig. 10(b)

6 Conclusion

This study proposes a PTHA method considering the randomness of the slip distribution of tsunami sources and the uncertainty of the seawall height during earthquake and tsunami events, which have a significant impact on the uncertainty of the evaluation of tsunami inundation depth. There are two main innovations resulting from this study. First, the computational cost for PTHA is reduced. By applying the POD technique to tsunami inundation depth distributions obtained from tsunami sources with random slip distributions, the number of tsunami sources to be considered was successfully reduced to 20% of the original number. Second, this is the first study to carry out a PTHA that considers the uncertainties in seawall height. When assessing tsunami inundation depths on land, the failure degree of seawalls has a significant impact, and this impact is considered in the present study by treating the failure degree of seawalls probabilistically and incorporating it into the tsunami inundation assessment model. Although the randomness of earthquake slip distributions has been studied extensively in the past in the research field of PTHA, there has been no assessment model that can simultaneously consider the failure degree of seawalls. Finally, the present study successfully generated an inundation distribution map that represents the probability of the occurrence of inundation depths within the next 10 years and 50 years, considering the randomness of the earthquake slip distributions and the uncertainty of the seawall height at the time of the earthquake and tsunami. Using this methodology, probabilistic hazard maps can be efficiently produced in each region, which is the first step for tsunami disaster countermeasures and may help to promote various countermeasures.

Finally, the future research tasks are described. This study assessed the uncertainty of seawall height assuming only two cases, but in actual application, the probability density of seawall height should be set according to the situation of the seawalls. Currently, there are no clear criteria for what probability density should be set for any type of seawall; thus, this is an issue for the future. In addition, there are many parameters that affect tsunami inundation depths on land during the generation, propagation, and runup phases of a tsunami. In the tsunami generation process; the location and geometry of the earthquake fault (fault parameters); the starting point, propagation velocity and rise time of the fault rupture; and the calculation method of the initial water level affect the simulation results of tsunami hazards. Fault parameters that determine the location and geometry of an earthquake fault include the slip amount, fault depth, slip angle (rake), strike, and dip angle; thus, spatiotemporal changes in all these parameters affect the results. In the tsunami propagation process, governing equations for the tsunami phenomena, tide level, and bottom topography data affect the results. In the tsunami runup process, setting artificial structures (buildings and seawalls) and bottom roughness data affect the results. However, the present study dealt only will uncertainties arising from the randomness of the earthquake slip distributions during the tsunami generation phase and the uncertainty of seawall heights in the tsunami runup process; future analyses should consider the variabilities in many other parameters. Since POD and GPR are multivariate applicable, future analyses can consider the variabilities in several other parameters involved in tsunami inundation hazard assessment.