1 Introduction

Over the last decade, several tsunami events, such as those of Maule in 2010 and Tohoku in 2011, have further demonstrated the catastrophic and widespread potential for death and destruction inherent in tsunami waves and, consequently, the need to improve the reliability of tsunami early warning systems. The 2010 Maule earthquake (\(M_{\mathrm{w}} 8.8\)) generated a tsunami that caused severe damage and loss of life in coastal communities (e.g., Fritz et al. 2011) and highlighted the consequences of an ineffective alert (Soulé 2014). Events such as this have reaffirmed the importance of improving timely evacuation warnings, which are considered to be one of the most effective ways to reduce loss of human life and damage to coastal communities (Okal 2015).

The goal of a tsunami warning system is to deliver a timely and meaningful evaluation of the hazard to authorities and to the population at large, with the main objective of triggering evacuation. While the role of education is usually considered the cornerstone for successful responses, the role of accurate information regarding the actual hazard is also relevant (Okal 2015; Bernard and Titov 2015). Over the years, improving tsunami hazard assessment has followed different approaches, all attempting to maximize the lead time of the warning relative to tsunami arrival.

For far-field tsunami forecasting, where the coastal tsunami impact can be quantitatively evaluated well in advance of tsunami arrival, the approach is to combine monitoring of the actual tsunami along its propagation path with numerical simulations, and take advantage of this information to estimate the hazard. For instance, the USA has a worldwide network of offshore tsunami observatories, which are located near several subduction zones at distances equivalent to 30–60 min tsunami travel time from expected tsunamigenic earthquake sources. Data are used as input to inversion procedures to obtain an estimate of the earthquake or tsunami source (Percival et al. 2011), which allows forecasting a set of tsunami hazard products (Bernard and Titov 2015).

For near-field tsunami forecasting, on the other hand, the lapse between tsunami generation and arrival may be too short to perform tsunami source evaluations and forward simulations. Williamson and Newman (2019) show that, in some places, the first arrival can occur in as little as 5 min. In the field, Aránguiz et al. (2016) reported arrivals in less than 8 min for the \(M_\mathrm{w}\; 8.4\) Illapel event, based on anecdotal evidence. A relevant aspect of the problem is the finite time required to acquire enough data to allow for source inversion. Owing to the significantly different propagation speeds of tsunami and seismic waves, the preferred approach is to rely on seismic data alone. Although some seismic source solutions can be obtained in relatively short time, they are subject to uncertainties that can affect the accuracy of tsunami estimates, hence limiting their applicability for tsunami early warning purposes (Cienfuegos et al. 2018). In addition, computing tsunami propagation and inundation has a high computational cost. Consequently, to date, most operating tsunami early warning systems use table look-up procedures on datasets of simplified seismic scenarios previously calculated and stored in a database (e.g., Gica et al. 2008; Kamigaichi et al. 2009). This approach is used in countries such as Japan, Australia, and Indonesia, and recently in Chile, but its accuracy is limited owing to the likely mismatch between the simple stored scenarios and the actual event (Behrens et al. 2010).

It has been noted that including tsunami data in the inversion process usually leads to improved results in estimating the tsunami hazard (e.g., Behrens et al. 2010; Gusman et al. 2014; Cienfuegos et al. 2018). Hence, it is highly desirable to incorporate tsunami data as early as possible. These observations can be acquired from tide gage data (Satake 1987; Satake and Kanamori 1991), satellite altimetry (Arcas and Titov 2006; Hirata et al. 2006), deep-ocean tsunameters (Titov 2009; Percival et al. 2011; Rabinovich and Eblé 2015), and cabled ocean-bottom pressure sensors (OBPSs; Baba et al. 2004; Tsushima et al. 2009, 2012). Among these, coastal stations (tide gages) are typically not considered for early warning purposes, owing to the negligible lead time they provide and the possible influence of coastal bathymetry on the hydrodynamics. Hence, offshore tsunami data from tsunameters such as OBPSs seem most appropriate for tsunami early warning systems.

Williamson and Newman (2019) analyzed the possible coastal locations where sensors can provide a meaningful lead time for near-field tsunamis. These locations concentrate along narrow bands that run roughly parallel to subduction zones. In addition, one of the most important factors for tsunami forecasting based on offshore tsunami data is the configuration of the array of tsunami stations. The accuracy of the reconstruction of the tsunami source strongly depends on the azimuthal coverage of observation stations with respect to the source area, which is improved when the sensors are located close to the main beam of tsunami energy (e.g., Pires and Miranda 2001; Piatanesi et al. 2001; Bernard et al. 2001). In addition, a large number of sensors enables better resolution of the finer structure of the tsunami source. For instance, Japan has a few submarine cabled seafloor observatory networks (e.g., the S-net, DONET1, and DONET2 systems) that provide data in real time. Tens of bottom pressure sensors have been installed or are planned to be installed (e.g., Kaneda et al. 2015; Kawaguchi et al. 2015) at an estimated cost of US$ 500M, according to Bernard and Titov (2015). Maintenance and operational costs also need to be accounted for, in addition to installation costs. This high cost may pose a significant hurdle for developing countries along subduction zones, such as Chile or other countries on the eastern Pacific seaboard, where there is an extensive seismogenic zone. To attempt to overcome this, it is valuable to study whether arrays with fewer sensors could provide a working solution at lower cost.

The number and placement of tsunameters have been based on expert judgment, considering technical aspects such as the variability and location of seismic sources (Schindelé et al. 2008), travel time (Schindelé et al. 2008; Omira et al. 2009; Williamson and Newman 2019), financial factors, for instance pertaining to installation or operation costs (Araki et al. 2008), and legal aspects, such as geographic boundaries or legal jurisdictions (Abe and Imamura 2013), among others. Regarding the construction of a sensor array, prior research indicates that two to four observation stations are capable of constraining the source parameters quite well if the stations are optimally located relative to the main tsunami energy beam, whereas adding more data does not significantly improve the inversion results (e.g., Percival et al. 2011; An et al. 2018). For example, Bernard et al. (2001) suggested that sensor spacing of about 175–700 km in the along-strike direction is required to characterize the main energy beam of \(M_\mathrm{w} \sim 8.0\) events using just three sensors. On the other hand, the relative distance between sensors and between sensors and the source will be constrained by the tsunami travel time and the duration of the tsunami record to be used in the inversion (e.g., Bernard et al. 2001). Hence, given an earthquake, it is possible to define an area (henceforth termed the influence area \(A_l\)) over which the tsunami waves have already propagated away from the source and allow for sufficient data to be collected (e.g., Williamson and Newman 2019). A minimum of two sensors must be considered inside this area. This is equivalent to defining a data observation time \(T_0\), if the tsunami propagation speed is known. For the present application, \(T_0\) is user defined, thereby limiting the areal coverage where sensors can be placed and how much data is used in the inversion.

Once possible locations of the sensors have been established, evaluations of the best configuration have adopted different approaches. Both Schindelé et al. (2008) and Omira et al. (2009) took into account local seismicity to identify possible sources, and designed the optimal placement of sensors using the travel time and a set of delays as the target function. They differ in that Schindelé et al. (2008) used 16 evenly spaced tsunami sources spanning a long stretch of coast of the Western Mediterranean Basin, which were used to test the efficacy of two predefined arrays. Thirteen tsunameters spaced at about 50–90 km yielded the best results. Instead, Omira et al. (2009) used just a single scenario in each of five tsunamigenic zones, though all were encompassed within a small domain, resulting in a common array of just three sensors. On the other hand, Spillane et al. (2008) applied an optimization approach to place Deep-ocean Assessment and Reporting of Tsunami (DART) buoys in the Aleutian Islands and Alaska regions. Results suggested that arrival time is the main restriction on tsunameter placement; it was found that adding more than three sensors did not improve the results significantly. Mulia et al. (2017) also used optimization methods, specifically a dimensionality reduction approach, to initiate the process. Unlike the previous cases where the goal was to address the performance for a wide range of sources, the goal of Mulia et al. (2017a, 2017b) was to identify the best placement of sensors to characterize a specific, large-magnitude, target scenario. Their focus was to resolve in great detail the characteristics of nonuniform slip by maximizing the accuracy of inverting a set of stochastic scenarios on a predefined spatial domain. Inverted sources were compared against each stochastic source, under the assumption that the tsunami would be well determined if the earthquake source was well retrieved. Hence, no evaluation of the tsunamis was performed. An initial set of 30 sensors was obtained, which was reduced to 23 sensors at specific locations after optimization. This highlights that, in pursuing the detailed spatial distribution of slip, a large number of sensors is required. Recently, Saunders (2018) proposed an improvement in slip characterization for the Cascadia Subduction Zone, testing five different dense sensor arrays, which coupled Global Navigation Satellite System (GNSS) sensors with tsunameters. To assess accuracy, they compared the inverted results against the input data in terms of the root mean square difference in fault slip, the maximum fault slip, the tsunami amplitude at the coast, and the percentage of coastline hit by high-amplitude waves.

Note that the performance of sensor networks has typically been quantified by analyzing their accuracy in predicting the arrival time (Bernard et al. 2001; Schindelé et al. 2008; Omira et al. 2009), the source slip (Mulia et al. 2017; Saunders 2018), or the amplitude at the coast (Mulia et al. 2017b; Saunders 2018), among others. Typically, these parameters are analyzed independently. In this work, a methodology for estimating the optimal placement of a small number of sensors is used and tested. While similar in scope to previous studies (e.g., Schindelé et al. 2008; Omira et al. 2009), here other tsunami parameters are included in the assessment and combined into a single cost function to enable an objective comparison. In particular, three different tsunami parameters are considered in unison to find the optimal configuration.

2 Methodology

The overall objective of this work is to determine an optimal array configuration of offshore tsunami sensors for near-field tsunami forecasting, to be used as a data source for a tsunami inversion technique. To this end, a three-step process is applied. The first step consists of the implementation of the inversion procedure. Next, a number of different sensor arrays are used to invert two target scenarios. Finally, the results of these source inversion tests are objectively compared by means of defined performance indicators. The analysis focuses on maximizing the forecast accuracy of three relevant tsunami parameters: arrival time, maximum tsunami amplitude, and forecast skill.

Fig. 1

a, b Sea surface deformation for the scenarios considered. Stars indicate epicenters. Black squares show virtual observation points offshore, red triangles indicate forecasting points, and the yellow triangle represents the fixed sensor. a \(M_\mathrm{w}\) 8.3 and b \(M_\mathrm{w}\) 8.5 scenarios, located at the northern and southern ends of the Northern Chile Gap, respectively. c, d Corresponding maps of tsunami wave height, considered a proxy for tsunami energy, with the final sensor configuration shown by yellow triangles

2.1 Implementation of Inversion Algorithm

The method is built on the premise that, to determine the tsunami source, an inversion procedure must be implemented. Here, the tsunami Forecasting based on Inversion for initial sea-Surface Height (tFISH) method (see Tsushima et al. 2009 for details) is used, by which tsunami waveform data are inverted (in real time in an operational setting) to estimate the initial distribution of sea-surface displacement. To this end, a set of unit tsunami sources are propagated to observation points to obtain their Green’s functions, which are later used to perform the inversion and forecast the tsunami time series at target locations.

The initial sea surface model for each unit source corresponds to a Gaussian function. Here, a set of nearly 1000 unit sources is considered, each having dimensions of \(700 \times 700\) arcsec, with their centers spaced by 0.15 arcdeg, thereby overlapping to allow for smooth variation of the sea-surface displacement using a finite number of discrete elements (Aida 1972). They cover an area spanning about \(7^\circ \times 3^\circ \) (latitude, longitude, approximately \(680 \times 160\) km) that coincides with the Northern Chile Gap (Comte and Pardo 1991; Metois et al. 2013), as shown in Fig. 1a, b. In what follows, other data sources such as DART buoys are not considered in the analysis, under the premise that a completely new system is being developed in the area of interest, even though they may be of benefit to the inversion process. Additionally, for the particular case of Chile, the location of existing DART buoys in the area of interest requires observation times longer than those studied herein (Williamson and Newman 2019).
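As a concrete illustration, the construction of such a unit-source grid can be sketched as follows. This is a minimal sketch in Python; the grid extents, the Gaussian half-width, and all variable names are illustrative assumptions rather than the exact values of the operational implementation.

```python
import numpy as np

def gaussian_unit_source(lon, lat, lon0, lat0, sigma_deg):
    """Initial sea-surface displacement of one unit source: a Gaussian
    bump of unit peak amplitude centered at (lon0, lat0)."""
    r2 = (lon - lon0) ** 2 + (lat - lat0) ** 2
    return np.exp(-r2 / (2.0 * sigma_deg ** 2))

# Overlapping unit-source centers spaced 0.15 arcdeg over a ~7 x 3 deg
# region (illustrative extents for the Northern Chile Gap; ~1000 sources).
lat_centers = np.arange(-23.0, -16.0, 0.15)
lon_centers = np.arange(-72.5, -69.5, 0.15)
sigma = 700.0 / 3600.0 / 2.0   # assumed half-width: half of 700 arcsec, in degrees

# Example: evaluate one unit source on a coarse lon-lat grid
lon, lat = np.meshgrid(np.linspace(-74.0, -69.0, 151),
                       np.linspace(-24.0, -15.0, 271))
eta0 = gaussian_unit_source(lon, lat, lon_centers[10], lat_centers[20], sigma)
```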

All Green’s functions were estimated in advance considering a prescribed duration of the event, T, using the tsunami model JAGURS (Baba et al. 2016). This is a parallelized numerical model that can solve the nonlinear Boussinesq dispersive equations in spherical coordinates with a leapfrog staggered-grid, finite-difference calculation scheme. However, for the present case, nonlinear terms were not considered. Bathymetry data were obtained from the GEBCO_2014 grid (version 20150318, www.gebco.net) with spatial resolution of 30 arcsec (Weatherall et al. 2015). Data were stored in a database, from which appropriate tsunami free surface time series, \(\eta (t)\), were extracted and stored for use in the inversion and forward calculations. These were collected at 212 offshore observation points, identified in Fig. 1a, b by black squares. Each of these denotes the possible location of a single observing station or sensor, and a collection of such sensors is termed a sensor array. Time series were also stored at seven coastal observation points located close to existing tide gages operated by the National Hydrographic and Oceanic Service of the Chilean Navy (SHOA, from its acronym in Spanish). The latter are used to evaluate the predictive performance at the coast (identified in Fig. 1a, b by red triangles). Typically, these are located at absolute depths greater than 200 m such that nonlinear effects can be neglected and linear superposition of tsunami time series can be performed with minimal errors. It is worth noting that tFISH considers the coseismic displacement in the inversion procedure, thereby allowing placement of sensors in the seismogenic zone. More details about tFISH can be found in Tsushima et al. (2009, 2012).
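Conceptually, the inversion that follows reduces to a linear least-squares problem: the stacked sensor records are modeled as a weighted sum of the unit-source Green's functions. The sketch below shows a generic damped least-squares version under that assumption; the operational tFISH algorithm (Tsushima et al. 2009) additionally applies smoothness constraints and accounts for coseismic displacement, which are omitted here.

```python
import numpy as np

def invert_initial_surface(G, obs, damping=0.1):
    """Estimate unit-source weights from observed waveforms.

    G   : (n_samples, n_sources) matrix; column k stacks, over all sensors,
          the time series produced by unit source k (its Green's function).
    obs : (n_samples,) stacked observed waveforms at the same sensors/times.

    Damped (Tikhonov) least squares keeps the solution stable when a few
    sensors must constrain many overlapping unit sources.
    """
    n_sources = G.shape[1]
    lhs = G.T @ G + damping * np.eye(n_sources)
    rhs = G.T @ obs
    return np.linalg.solve(lhs, rhs)   # weights of the Gaussian unit sources
```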

2.2 Numerical Experiments

For the purpose of testing and evaluating the performance of different sensor arrays, the accuracy of the inversion was evaluated using a set of prescribed tsunami sources, termed design scenarios. While it is possible to test a large set of sources over a large domain, as in Schindelé et al. (2008), here the focus is on using sources located at the extremes of the area of interest. The assumption is that these will correspond to the most demanding configuration for the sensor arrays owing to their relative location to the main energy beam. A large number of sensor arrays are compared to identify the array that provides the best performance.

Two synthetic tsunamigenic earthquakes, with magnitudes \(M_\mathrm{w}\) 8.3 and 8.5, respectively, are considered as the design scenarios. These were determined by Cienfuegos et al. (2014) as representative events for the Northern Chile Gap, based on the interseismic coupling model of Chlieh et al. (2011), information on interseismic slip rates, and convergence models. The northernmost design scenario is located just offshore of Arica, near the Chile–Peru border, whereas the southern design scenario is located just north of Mejillones Peninsula, as seen in Fig. 1. It is expected that the resulting network will be able to identify earthquakes spanning a section of coast about 680 km long. For reference, these scenarios flank the rupture area of the 2014 Pisagua earthquake.

To carry out the analysis, each design scenario is propagated forward to all forecast points, using the linear shallow water equation model COMCOT (Wang 2009) with the same grid as in JAGURS, thus providing observed time series, \(\eta _{\mathrm{obs}}(t)\). Those recorded at offshore sensors are considered as target time series to be used in the inversion process, whereas coastal time series are used to assess the performance of the solution. The use of a different tsunami model in propagating the target scenario and in preparing the database of Green’s functions is intended to reduce possible overfitting in the inversion. In addition, it is assumed that other sources of noise usually present in deep-ocean data, such as tides and seismic noise such as the recording of Rayleigh waves (e.g., Webb 1998), have been removed during preprocessing using detiding, or low- and high-pass filters, as described for instance in Rabinovich and Eblé (2015). For example, Tsushima et al. (2012) consider a 60-s moving average to process data for tFISH. Here, since synthetic time series are used, these effects are considered to have been filtered out already, as also done by Mulia et al. (2017). However, it is still possible for spurious signals to contaminate the time series and affect the inversion procedure. To simulate this, Gaussian noise with a maximum amplitude of 10% of the variance of the noise-free target tsunami time series is added (e.g., Romano et al. 2016; Mulia et al. 2017).
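The two preprocessing operations mentioned above, amplitude-capped Gaussian noise and a moving-average filter, are straightforward; a minimal sketch follows, with the window length and random seed as illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed for reproducibility (assumption)

def add_capped_noise(eta, fraction=0.10):
    """Contaminate a clean synthetic waveform with Gaussian noise whose
    maximum amplitude equals `fraction` of the waveform's variance."""
    max_amp = fraction * np.var(eta)
    noise = rng.normal(0.0, 1.0, eta.size)
    noise *= max_amp / np.max(np.abs(noise))   # cap the noise amplitude
    return eta + noise

def moving_average(eta, dt, window_s=60.0):
    """Moving-average filter, e.g., the 60-s window of Tsushima et al. (2012)."""
    n = max(1, int(round(window_s / dt)))
    return np.convolve(eta, np.ones(n) / n, mode="same")
```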

The relative position of each sensor with respect to the tsunami scenarios determines not only the tsunami arrival time at the sensor but also the amount of data usable in the inversion. In existing inversion procedures, each sensor uses a different observation time, sufficient to gather at least a quarter of the initial tsunami waveform. Here, the observation time \(T_0\) is defined so as to determine an area such that any sensor located inside this area could record at least half a tsunami wavelength. This defines the influence area, \(A_l\). Note that the observation time is also restricted by the tsunami arrival at the coast, in order to provide sufficient lead time for an eventual warning. As a starting point, \(T_0=10\) min is used, consistent with the observed tsunami arrival of the 2014 Pisagua tsunami (Catalán et al. 2015).

Considering the time restriction imposed, there is a relatively small region where the influence areas \(A_l\) of both scenarios overlap. Hence, it is possible to place a sensor that would serve both the northern and southern sections, located in the outer rise offshore of Iquique (see yellow triangle in Fig. 1). While the selection of this sensor was arbitrary, the relatively small number of alternatives led to no significant sensitivity to its selection (not shown). To find the location of the other sensors, the influence area for each scenario was discretized every 0.25 arcdeg (\(\sim \) 30 km) in both latitude and longitude to define a grid of possible sensor locations, as sketched below. This spacing is significantly smaller than the tsunami wavelength in the area. In addition, these offshore observation points are located at depths large enough for tsunami nonlinear effects to be considered negligible, thereby ensuring consistency with the assumptions of the inversion algorithm. As a result, 99 and 113 possible offshore observation points were defined in the northern and southern parts, respectively. A sensor array was defined by pairing the common sensor and each of the possible observation points. Note that potential observation points are located on both the landward and seaward sides of the trench. While this placement could be subject to further operational restrictions such as deployment cost, these are not considered herein, since the aim is to assess the methodology to select the optimal array. The difference in the number of observation points between the north and south regions is due to differences in the source dimensions and in the tsunami celerity arising from the bathymetry, both of which control the extent of \(A_l\).
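A minimal sketch of this screening step is given below, assuming a precomputed travel-time field and treating the half-wavelength requirement as a nominal half-period; both the grid extents and the thresholds are illustrative assumptions.

```python
import numpy as np

def influence_area_mask(travel_time_min, depth_m,
                        t0_min=10.0, half_period_min=5.0, min_depth_m=200.0):
    """Mark candidate grid nodes inside the influence area A_l.

    A node qualifies if at least half a wave (approximated here by a
    nominal half-period, an assumption) fits within the observation
    window T0, and if the node is deep enough for linear theory to hold.
    """
    early_enough = travel_time_min + half_period_min <= t0_min
    deep_enough = depth_m >= min_depth_m
    return early_enough & deep_enough

# Candidate grid spaced every 0.25 arcdeg (illustrative extents)
lat_grid = np.arange(-23.0, -16.0, 0.25)
lon_grid = np.arange(-72.5, -69.5, 0.25)
```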

2.3 Performance Assessment

For each of the 212 sensor arrays (99 and 113, for each scenario), the observed tsunami time series are used to invert the tsunami source. Once a source is determined, coastal tsunami time series are obtained by linearly combining precomputed tsunami Green’s functions weighted by the resulting initial distribution, and identified as forecasted data, \(\eta _{\mathrm{for}}(t)\). The use of Green’s functions is preferred over directly modeling the inverted events (as done by Schindelé et al. 2008, for instance) because it allows for testing and comparing the solutions from a large number of sensor array configurations at low computational cost.
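Under linearity, the forecast at a coastal point is then just a weighted sum of precomputed time series; a minimal sketch, with names following the notation above:

```python
import numpy as np

def forecast_waveform(weights, greens_coastal):
    """Forecast eta_for(t) at one coastal point by linear superposition.

    weights        : (n_sources,) inverted initial-surface weights.
    greens_coastal : (n_sources, n_t) precomputed unit-source time series
                     at that coastal forecast point.
    """
    return weights @ greens_coastal   # (n_t,) forecasted time series
```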

To assess the performance of each sensor array, the observed (from the full tsunami simulations using COMCOT) and the forecasted (from the linear combination using inverted source solutions) data are compared. Specifically, three tsunami parameters are considered. The first is the tsunami arrival time (e.g., Schindelé et al. 2008; Omira et al. 2009). This is considered to be an essential parameter for tsunami warning system frameworks to provide a timely hazard assessment. However, the definition of arrival time is relatively loose and could refer to different stages of the tsunami, such as the first exceedance of a threshold, the first local crest, or the initial trough of an N-wave, among others (e.g., Hayashi et al. 2011). Here, two different definitions are considered in order to make the analysis more robust. The first defines arrival time as the time \(T_1\) at which the free surface first exceeds a prescribed threshold, \(l_1\)

$$\begin{aligned} T_1(\eta ) = \text {min}(t \in (0,T) \mid \eta (t) > l_1). \end{aligned}$$
(1)

This arrival time definition is also used by the German–Indonesian Tsunami Early Warning System (GITEWS, Rakowsky et al. 2013), and for the present implementation a threshold of \(l_1=0.05\) m is defined. However, considering that in some cases the tsunami time series might not exceed this prescribed threshold, a second arrival time is defined as the time when a proxy for the slope of the free surface exceeds the threshold \(l_2\)

$$\begin{aligned} T_2(\eta ) = \text {min}\left( t \in (0,T) \mid \frac{\partial \eta }{\partial t} > l_2\right) . \end{aligned}$$
(2)

The aim is to establish a measure of the rate of change of the tsunami signal as an early proxy for the first local maximum. In this case, the value chosen was \(l_2=0.002\). In doing this, it is assumed that the actual free surface slope is proportional to the rate of change as per the long-wave approximation

$$\begin{aligned} \frac{\partial \eta }{\partial x} \propto \frac{\partial \eta }{\partial t}. \end{aligned}$$
(3)
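Both arrival-time definitions are simple to evaluate on a sampled time series. The sketch below is a direct transcription of Eqs. 1 and 2, with the time derivative approximated by a finite difference; thresholds follow the values given in the text.

```python
import numpy as np

def arrival_time_t1(t, eta, l1=0.05):
    """T1 (Eq. 1): first time the free surface exceeds the threshold l1 [m]."""
    idx = np.nonzero(eta > l1)[0]
    return t[idx[0]] if idx.size else np.nan   # may never exceed l1

def arrival_time_t2(t, eta, l2=0.002):
    """T2 (Eq. 2): first time the rate of change of eta exceeds l2,
    used as a proxy for the surface slope (long-wave approximation, Eq. 3)."""
    deta_dt = np.gradient(eta, t)              # finite-difference d(eta)/dt
    idx = np.nonzero(deta_dt > l2)[0]
    return t[idx[0]] if idx.size else np.nan
```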

The accuracy in predicting the possible impact and magnitude of the tsunami is also relevant for a tsunami warning system. Hence, the second parameter considered is the maximum tsunami amplitude (here denoted as H), which is used to categorize the hazard in most existing tsunami early warning systems. The maximum tsunami amplitude is estimated as

$$\begin{aligned} H={\text {max}}(\eta (t)-\eta (0) \mid t \in (0,T)). \end{aligned}$$
(4)

Both the arrival time and the amplitude are scalar parameters that are estimated independently from the observed and forecasted free surface time series and then compared. However, neither considers the accuracy in retrieving the shape of the waveform. Hence, a skill estimator (\(S_k\)) is also computed, as this index is commonly used to evaluate model accuracy (e.g., Hampson et al. 2011)

$$\begin{aligned} S_k=\sqrt{\frac{\sum _{i=1}^{n}\left( \eta _{\mathrm{for}}-\eta _{\mathrm{obs}} \right) ^2}{\sum _{i=1}^{n}\left( \eta _{\mathrm{obs}} \right) ^2}}, \end{aligned}$$
(5)

where n corresponds to the number of time steps of the time series.

In assessing the accuracy, the error between observed and forecasted quantities is estimated for arrival times and maximum tsunami amplitudes. However, one possible difficulty in establishing a standard metric is that each of these parameters has its own scale, with significantly different ranges of values. For example, while an error in arrival time of a few minutes can be considered acceptable (for instance, less than 2 min), a variation in height of more than 1 m can represent a large error and change the hazard category. To provide a common basis of comparison for all possible sensor configurations, the error of each parameter is nondimensionalized by dividing by the reference provided by the observed data. In addition, a single parameter having a large error could bias the combined assessment. Therefore, the estimated error is capped, under the assumption that errors larger than the observed value are treated as equally significant. This is implemented as follows:

$$\begin{aligned} \varDelta T_\mathrm{m}= \text {min} \left\{ \left| \frac{T_\mathrm{m}(\eta _{\mathrm{obs}})-T_\mathrm{m}(\eta _{\mathrm{for}})}{T_\mathrm{m}(\eta _{\mathrm{obs}})} \right| ; 1 \right\} , \end{aligned}$$
(6)
$$\begin{aligned} \varDelta H= \text {min}\left\{ \left| \frac{H(\eta _{\mathrm{obs}})-H(\eta _{\mathrm{for}})}{H(\eta _{\mathrm{obs}})-\eta _{\mathrm{obs}}(0)} \right| ; 1 \right\} , \end{aligned}$$
(7)
$$\begin{aligned} S_k= \text {min}\left\{ S_k ; 1\right\} , \end{aligned}$$
(8)

where \(m=1,2\) denotes the two arrival time parameters \(\varDelta T_1\) and \(\varDelta T_2\). To combine both definitions of arrival time, the error estimator associated with this parameter is taken as the average of the two percentage errors, given by

$$\begin{aligned} \varDelta T=\frac{\varDelta T_1+\varDelta T_2}{2}. \end{aligned}$$
(9)

By introducing the saturation value, errors \(\varDelta T_\mathrm{m}\) or \(\varDelta H\) exceeding 100% are not penalized in excess and do not bias the overall error, allowing identification of the importance of the other parameters in the comparison. In the case of the skill, \(S_k=1\) indicates that the magnitude of the error is comparable to or greater than the observed values, whereas \(S_k=0\) indicates a perfect fit. In addition, whether each quantity is under- or overestimated is not considered relevant, so absolute values are used instead.
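A compact transcription of Eqs. 5–9 illustrates how the three capped estimators are computed for one forecast point; variable names are illustrative, and the amplitude error is here simply normalized by the observed value.

```python
import numpy as np

def skill(eta_for, eta_obs):
    """Normalized RMS misfit S_k (Eq. 5), capped at 1 (Eq. 8)."""
    sk = np.sqrt(np.sum((eta_for - eta_obs) ** 2) / np.sum(eta_obs ** 2))
    return min(sk, 1.0)

def capped_relative_error(x_obs, x_for):
    """Nondimensional, saturated error used for T_m (Eq. 6) and H (Eq. 7)."""
    return min(abs((x_obs - x_for) / x_obs), 1.0)

def arrival_time_error(t1_obs, t1_for, t2_obs, t2_for):
    """Average of the two capped arrival-time errors (Eq. 9)."""
    return 0.5 * (capped_relative_error(t1_obs, t1_for)
                  + capped_relative_error(t2_obs, t2_for))
```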

It is proposed that, instead of analyzing these metrics independently, it is desirable to identify the sensor configuration that yields the minimum combined error. However, it might be relevant for the user to give preference to one of the metrics above the others. Hence, to combine all estimators into a single quantitative estimate, a forecasting accuracy function is introduced

$$\begin{aligned} F_{i,j}(\varDelta T, \varDelta H, S_k)=\alpha \varDelta T_{i,j} + \beta \varDelta H_{i,j} +\gamma \left( S_k\right) _{i,j}, \end{aligned}$$
(10)

which quantifies the total error of the estimation at any coastal forecasting point j given an offshore array i. Here, \(\alpha ,\;\beta \) and \(\gamma \) are weights that allow for user-defined tuning of the relative importance of each parameter. The sum of the weights should be 1, in order to preserve the comparison basis. In this way, it is possible to quantitatively compare the performance of all sensor arrays at any given forecast point. A conceptually similar approach was used by Behrens et al. (2010), although their objective was to evaluate the benefit of incorporating different sensors in an inversion procedure, rather than finding the best placement of them.

It is possible that comparisons at a single coastal point might also be subject to bias. To assess the overall predictive capability, the aggregated performance is computed by averaging the individual results at the forecast points of interest, given a sensor array i

$$\begin{aligned} \text {EG}_i = \frac{1}{N}\sum _{j=1}^{N}F_{i,j}(\varDelta T, \varDelta H, S_k), \end{aligned}$$
(11)

where N is the number of forecast points chosen. Upon calculating the global error by means of Eq. 11 for all possible offshore arrays, the candidate array is selected as the one having the minimal global error \((\text {min}(\text {EG}_i))\). As in the case of Eq. 10, it is possible for Eq. 11 to be biased by a few forecast points. To avoid this, it is required that every forecasting point j of the array have an accuracy function value smaller than a threshold (\(F_{i,j}(\varDelta T, \varDelta H, S_k) < \mu , \;\; \forall j\)). This filter is equivalent to ensuring a minimal forecast capacity at each forecasting point j, and is applied after all sensor arrays have been compared.
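The selection step therefore amounts to a filtered arg-min over the matrix of accuracy-function values; a minimal sketch, assuming \(F_{i,j}\) has already been evaluated for every array and forecast point:

```python
import numpy as np

def select_array(F, mu=0.55):
    """Pick the sensor array with minimum global error EG_i (Eq. 11),
    discarding arrays where any forecast point exceeds the threshold mu.

    F : (n_arrays, n_points) matrix of accuracy-function values F_{i,j}.
    """
    eg = F.mean(axis=1)               # EG_i for every array (Eq. 11)
    valid = (F < mu).all(axis=1)      # quality filter at every point j
    eg = np.where(valid, eg, np.inf)  # exclude filtered-out arrays
    best = int(np.argmin(eg))
    return best, eg[best]
```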

Fig. 2

Spatial distribution of individual error estimates as a function of sensor arrays in the northern section. a \(\varDelta T\) (Eq. 9); b \(\varDelta H\) (Eq. 7); c \(S_k\) (Eq. 8); and d total forecasting accuracy function, \(F_{i,j}\) (Eq. 10), estimated using as reference the coastal forecast point at Arica. Star indicates the scenario epicenter, the thin black line is the location of the trench, the yellow triangle denotes the location of the fixed sensor, and each grid point in the color map corresponds to a tested sensor

3 Results

In Figs. 2 and 3, the accuracy of the estimation is presented as spatial maps of each of the error estimates defined by Eqs. 6–10, shown on a color scale where smaller values (red colors) indicate better accuracy. Each grid point corresponds to the location of the second sensor of the array for each scenario, defined as the “tested sensor,” while the first is kept fixed at the location of the yellow triangle. The results shown correspond to the assessment obtained when using the observed time series at Arica and Patache, respectively, as the coastal forecast point j of interest. For the purposes of demonstration, the results analyzed correspond to the case when \(\alpha =\beta =0.4\) and \(\gamma =0.2\).

Fig. 3

Same key as Fig. 2, using as reference the coastal forecast point at Patache

The errors in arrival time, \(\varDelta T\) (Figs. 2a, 3a), are lower near the main energy beam, as shown by the warmer colors. Sensor locations close to the coast and the forecast point also show good performance. Arica and Patache show distinct behaviors: Arica is more sensitive to the sensor location, with an error of \(\varDelta T\sim 0.36\) on average, whereas Patache shows less sensitivity and better accuracy overall, partly because it benefits from the fixed sensor being located directly offshore of it, making the results less sensitive to the placement of the secondary sensor.

The error in amplitude, \(\varDelta H\), is shown in Figs. 2b and 3b. Unlike the case of the arrival time, the performance at Arica and Patache is similar, suggesting that the tested sensor dominates over the fixed one. The spatial distribution of the error \(\varDelta H\) is related to the directivity of the tsunami energy radiation. When the tsunami source has an aspect ratio (length to width) greater than 1, most of the tsunami energy is radiated perpendicular to the major axis of the source (Kajiura 1970, 1972). Consequently, errors smaller than 20% are found in some cases where the sensor is located close to the main energy beam (see Fig. 1c, d for reference on the energy beams). In contrast, when the sensor is located parallel to the major axis, where weaker amplitudes are radiated, the observed tsunami time series carries less information about the maximum tsunami energy, resulting in the initial sea surface being underestimated. This also leads to underestimation of the amplitude at forecast points. Thus, when sensors are located outside the main energy beam, accuracy decreases rapidly and reaches the saturation limit. The location of the fixed sensor does not allow for improved inversions, as it lies outside the main energy beam, explaining the similar performance at both sites. This marked difference in performance between the amplitude and arrival time errors highlights the need to also include the amplitude as a relevant tsunami parameter.

Similar conclusions are obtained from the skill indicator \(S_k\) (Figs. 2c, 3c), where higher forecasting accuracy is obtained for sensor arrays with at least one sensor located near the main energy beam. The loss of accuracy is not as well defined as with the amplitude, yet the minimum skill is close to \(S_k\sim 0.35\) for Patache. Therefore, the skill is a more demanding parameter overall, but does not lead to a clear distinction among sensor arrays. The combination of these individual errors in the forecasting accuracy function, \(F_{i,j}\), follows the same trend (Figs. 2d, 3d). The minimum global error is found in the area offshore of Arica and is influenced by the amplitude error. It also inherits the spatial structure of the arrival time error, arguably owing to the choice of weights analyzed. On the other hand, at Patache, the global error distribution shows less contrast between locations than at Arica. These results reinforce the idea that a single error estimate such as arrival time does not suffice to identify the best placement, but they also show that using a single forecast point as a reference can be affected by local dependencies. It is important to note that there is a smooth transition from lesser-quality results (cool colors) to good-quality results (warm colors) for all estimators, which means that the spatial discretization used in the sensor placement suffices to capture the error dependencies.

Fig. 4

Space maps of global error, \(\text {EG}_i\). Star indicates scenario epicenters, and yellow triangles show the selected sensor array. a \(\text {EG}_i\) for the northern scenario, estimated using Arica, Pisagua, Iquique, and Patache. b \(\text {EG}_i\) for the southern scenario, estimated using Patache, Tocopilla, Mejillones, and Antofagasta

The aggregate of the forecasts at coastal points is estimated using Eq. 11 with four coastal points (\(N=4\)), namely Arica, Pisagua, Iquique, and Patache for the northern design scenario, and Patache, Tocopilla, Mejillones, and Antofagasta for the southern case. These points were chosen owing to their proximity to either source and their similar arrival times. Figure 4 shows the distribution of global error \(\text {EG}_i\) for both events. As before, the spatial maps show a relatively small area of better accuracy near the epicenter of the earthquakes, but the variability is reduced owing to the averaging effect of considering several forecast coastal points in unison in the evaluation. Despite this, it is still possible to identify locations where a sensor could be deployed to yield the best overall performance. Therefore, the final array configuration is determined by identifying, independently for each scenario, the configuration with minimum global error \(\text {EG}_i\) for which the total error at each coastal forecast point satisfies \(F_{i,j} \le \mu =0.55\), to ensure good quality at each forecast point. The coordinates of the selected sensor array are presented in Table 1. For reference, array number 62 out of 99 is obtained for the northern scenario, and array number 66 out of 113 for the southern one. In both cases, array configurations are numbered in accordance with the grid location, from west to east, then north to south.

Fig. 5

a–d Sample spatial maps of the global error, EG, as a function of the combination of weights. From left to right, combinations 1, 10, 12, and 20, respectively. e Matrix of the error EG (northern scenario) for each sensor array as a function of the combination of weights. Array configurations are numbered in accordance with the grid location, from west to east, then north to south

Table 1 Proposed sensor array configuration

4 Discussion

The methodology presented herein allows for objective comparison among several array configurations by using three relevant tsunami parameters. The method was tested by defining a minimum of two sensors to form a sensor array, under the premise that this represents the least expensive implementation. Testing for arrays comprising a larger number of sensors, and/or tsunami parameters, could be implemented in a similar manner, albeit at greater computational cost, but is not considered further in this work.

It is possible that, for actual implementations, the selection of a sensor may be subject to additional restrictions. For example, real-time data transmission may constrain placement to locations within line of sight or communications coverage, and deployment away from the trench may be preferred to reduce the effect of seismic noise and coseismic signals, among many other possible restrictions. Such restrictions are not considered here as they can be sensor specific, although they could be easily incorporated into the method by simply restricting the locations at which a sensor can be deployed.

The methodology expands on previous research by quantitatively incorporating different tsunami parameters into a cost function which can be minimized. However, the weights (\(\alpha\), \(\beta\), and \(\gamma\)) in the cost function \(F_{i,j}\) (Eq. 10) were chosen based on the relative importance of each parameter in previous studies and on the hazard categorization in tsunami early warning systems. Hence, both arrival time and tsunami amplitude were given a greater weight than skill. To evaluate the influence of this selection of weights on the assessment, a sensitivity analysis was carried out. Each weight was modified by up to \(\pm \, 0.10\) in steps of 0.05. In addition, cases when only one parameter is used were also considered, yielding 23 weight combinations as presented in Table 2.

Table 2 Parameter space of weight values used

This analysis is used to evaluate whether modifying the weights induces a change in the selection of the sensor array. In Fig. 5a–d, sample spatial maps of the global error \(\text {EG}_i\) are shown for some of the weight combinations (baseline combination, and combinations 10, 12, and 20; see Table 2 for details). It can be seen that, although there is a variation of the value of the global error, and also some variation of its spatial distribution, the overall structure remains consistent. The notable exception is combination 20, which only considers \(\alpha =1.0\), i.e., only the error in arrival times. Figure 5e shows the value of the global error \(\text {EG}_i\) for each array configuration for the northern scenario, as a function of the weight combination. For all combinations, the minimum error is obtained for array 62, thereby showing no sensitivity to the distribution of weights. This could be due to several factors. For instance, perhaps the sensitivity range used in this test was not large enough to alter the result. However, even when only one parameter is considered, the solution remains unaltered. It could be argued that the solution is controlled by only one parameter. For instance, the error in amplitude \(\varDelta H\) yields the minimum errors in several arrays (see red colors for combination 21 in Fig. 5e). However, there are other instances where the arrival time is the parameter yielding the minimum error (see combination 20). It is concluded that the use of a cost function combining both parameters maximizes the ability to capture the overall structure of the tsunami signal. Moreover, the choice of weights initially proposed (\(\alpha =\beta =0.4, \gamma =0.2)\) appears to be a good compromise among them.
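Operationally, this sensitivity test is a loop over weight triples, recomputing the winning array each time; a minimal sketch, assuming the per-array error matrices have already been assembled:

```python
import numpy as np

def weight_sensitivity(dT, dH, Sk, combos):
    """Return the index of the minimum-EG array for every weight combination.

    dT, dH, Sk : (n_arrays, n_points) capped error matrices.
    combos     : iterable of (alpha, beta, gamma) triples (cf. Table 2).
    """
    winners = []
    for alpha, beta, gamma in combos:
        F = alpha * dT + beta * dH + gamma * Sk   # Eq. 10 for all arrays
        winners.append(int(np.argmin(F.mean(axis=1))))
    return winners

# The selection is insensitive to the weights if all entries coincide:
# len(set(weight_sensitivity(dT, dH, Sk, combos))) == 1
```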

The proposed methodology considers two characteristic design scenarios located at the extremes of the area of interest. Here, the capabilities of the selected network are evaluated in other cases to ensure that the proposed configurations offer good performance not only for the design scenarios. To this end, the southern event (\(M_\mathrm{w}\) 8.5) was modeled with different epicentral locations along the subduction zone, every 0.5 arcdeg along strike (as shown in Fig. 6). In addition, the observation time was changed (\(T_0=10\), 15, and 20 min) to investigate the effect of record length on the forecast.

Fig. 6

Initial sea-surface displacements used in evaluating the chosen sensor array. Red and yellow triangles indicate forecasting points and sensors, respectively. Epicenter is located at latitude a \(-19.5^\circ \), b \(-20.0^\circ \), c \(-20.5^\circ \), d \(-21.0^\circ \), e \(-21.5^\circ \)

Figure 7 shows the matrices of \(F_{i,j}\) at each forecast point j as a function of the epicenter location, considering the selected array configuration, for different observation times, \(T_0\). The aggregate error decreases as \(T_0\) is increased, as expected. Aggregate errors can reach values as low as \(F_{i,j}\sim 0.1\) when \(T_0= 20\) min (compare Fig. 7a and c). In the case of Chile, where the seismic zone is located very close to the coast, this long data acquisition time may exceed the arrival time, thus preventing the use of this information as a trigger for early warnings. However, it could be considered in the later stages of the emergency cycle. For instance, this information can be used to refine initial hazard assessments derived by other means, such as the existing database of precomputed scenarios.

Fig. 7

Forecast accuracy function \(F_{62,j}\) as a function of reference forecast point j (vertical axis) for the final array configuration, considering different target scenario locations (horizontal axis) and observation times \(T_0\) of a 10 min, b 15 min, and c 20 min

It is possible to observe a decrease in performance as the scenario approaches the reference forecast point. This results in poor accuracy at the northern forecast points (Arica, Pisagua, and Iquique) for events with epicenters at latitudes \(-19.0^\circ \) and \(-20.0^\circ \) (dark-blue cells in the upper-left corner of Fig. 7a), and a worsening of the accuracy as the epicenters are located further south (see the evolution of the error for Tocopilla). This is because these forecast points lie close to or inside the zone of predicted coseismic deformation for the actual event, whose displacement produces an apparent early arrival. Moreover, owing to the short observation time, the tsunami source solution results in a source with small spatial dimensions (see, for example, Fig. 8b, albeit for a different scenario). As a result of both situations, the arrival time yields a large error, \(F_{i,j}\sim 0.9\)–1.0. In the other cases, \(F_{i,j}\) averages 0.64. As the observation time is increased, the error in the prediction drops significantly and values as low as \(F_{i,j}\approx 0.40\) are obtained, which are considered to indicate good performance of the selected array for a wide range of locations.

As an additional test to evaluate the potential performance of the sensor array, the Pisagua tsunami on 1 April 2014 (\(M_\mathrm{w}\; 8.1\)) was used as a source scenario. Note that this event has a smaller magnitude than the scenarios used in designing the network, and also that it has a nonuniform slip distribution, with two main patches of slip. Synthetic tsunami waveforms at the location of the sensor array were obtained by a numerical simulation with COMCOT of the tsunami generated by the initial rupture model proposed by Hayes et al. (2014). As before, the accuracy of the assessment is evaluated using the coastal forecast points and different observation times. Gaussian noise with a maximum amplitude of 10% of the variance of the clean tsunami waveforms was added in the numerical experiments.

Fig. 8

Initial sea-surface displacement obtained from different observation times for the 2014 Pisagua tsunami. a Target initial surface, obtained using the slip distribution of Hayes et al. (2014) as source scenario. b–d Distribution of the initial sea-surface displacement estimated by tsunami waveform inversion considering \(T_0= 10\), 15, and 20 min. e–g Comparison of observed (black), observed data filtered with a nine-point moving average (red), and forecasted (blue) waveforms at the three sensors. Vertical light-blue lines in e, f, and g denote the observation time

The results are summarized in Table 3, the initial surface solutions are shown in Fig. 8a–d, and observed (black) and forecasted (blue) time series are shown in Fig. 8e–g, for different observation times \(T_0\). Within 10 min, the performance of the method is inadequate, with large errors and an inverted initial surface that only detects a localized source. As the observation time is increased, the errors decrease significantly, especially when forecasted time series are compared with observed data filtered using a nine-point moving average (red lines). The reconstructed sea surface condition now has an appropriate spatial extent but a smaller peak displacement. The solution is further improved for \(T_0= 20\) min, and includes traces of the secondary peak northwest of the rupture. However, as mentioned above, such a sensor array would only have provided timely assessment for locations outside the main rupture zone, since the observed tsunami arrival at the tide gages at Iquique and Pisagua was less than 15 min (An et al. 2014; Catalán et al. 2015).

Table 3 Summary of tsunami forecasting error \(F_{i,j}\) considering different data observation times for 2014 Pisagua event

The effect of weighting was tested by comparing the accuracy of the estimates for the Pisagua tsunami for different combinations of weights. In general, most forecast points have a relatively stable assessment, typically varying by less than ±0.10 in the error estimate (Fig. 9). However, the situation changes significantly if only one parameter is used. For instance, if only time is considered (combination 20), the assessment for some forecast points is worse, e.g., Iquique and Arica. On the other hand, Patache shows a significant improvement, possibly influenced by its proximity to the fixed sensor. The situation reverses when only the skill is considered (combination 22): Arica and Iquique show improved estimation, but Patache worsens. Therefore, the use of multiple parameters and multiple forecast locations in unison is essential for producing a more complete estimation of the error for determining the optimal configuration.

Fig. 9

Forecasting accuracy from Eq. 10 considering different weight combinations in Table 2 for the Pisagua 2014 tsunami

This suggests that the proposed network is capable of identifying smaller events with nonuniform slip reasonably well, and in reasonable time. It is also of note that the location of this scenario does not coincide with either of the scenarios used to design the network. However, two of the sensors are located close to, but not directly in, the main energy beam. When considered in unison, these two tests suggest that the sensor network considered would be appropriate for tsunamis generated by earthquakes of magnitude similar to or larger than that of the initial design scenarios. It is possible that a different network would be obtained if different design scenarios were used. Nevertheless, the methodology provides a means for identifying a suitable sensor array. This stresses that one relevant step prior to the implementation of the methodology is to determine these design scenarios by other means. In the present case, the choice was based on data available for the Northern Chile Gap.

The above results suggest that the methodology is capable of delivering a working tsunami observation system comprising just three sensors. Although the general rule that the spacing between sensors should be in the range of \(\sim \) 200–400 km (Bernard et al. 2001) is roughly preserved, the spacing obtained here varies between approximately 110 and 225 km. This yields a less dense network than those proposed by either Schindelé et al. (2008) or Omira et al. (2009). The methodology also allows for the optimal placement of the sensors within this range of distances. On the other hand, the results show that, similar to what is assumed by Gusman et al. (2014), the optimal placement corresponds to the area where the maximum displacement occurs. Considering the inherent uncertainty in predicting the slip distribution of a future earthquake, stochastic methods could be incorporated into this methodology to further refine the results. However, with the simulations performed here, it is possible to see that just three sensors could provide a baseline solution that would have been appropriate for an event such as Pisagua 2014.

It must be noted, however, that several other factors could be considered in the analysis. Financial considerations, such as implementation or operational costs, and technical constraints on sensor placement could be incorporated by limiting the placement of the sensors during the definition of the possible sensor locations. These technical constraints can arise from a wide variety of situations that could affect the accuracy of the sensors. For example, seismic noise and Rayleigh waves can contaminate the signal for sensors near the source. However, it is noted that, in some cases, pressure sensors have been installed successfully near the source, for example during the 2016 Mie-oki earthquake (Wallace et al. 2016). The results shown herein are based on the assumption that the sensors are capable of providing filtered free surface data at sufficient temporal resolution, independent of the type of sensor. While an inversion algorithm can be subject to a wide range of errors and uncertainties on its own, for the purpose of the present work, it is hypothesized that using a common inversion procedure will weight those errors equally, allowing comparison among different sensor configurations. The main goal of the presented procedure is the objective assessment of the best configuration, and it can be applied independently of the inversion method or the nature of the sensors.

The implementation of the procedure is computationally intensive in two of its steps. First, establishing the database of Green’s functions can be time consuming. The other aspect is the need to carry out inversions for a large number of sensor configurations. While in this case they were few (212 cases) because only pairs of sensors were considered, if a larger number of sensors is used, the processing cost could increase significantly. In such cases, it would be possible to apply optimization approaches to reduce the computational burden. Finally, to reduce the computational cost, forecasts at coastal stations were obtained by linear superposition. The use of this approach at shallow coastal points might be subject to inaccuracies arising from neglecting nonlinear interactions. Additionally, bathymetry-induced effects such as resonance were reduced here by placing the coastal forecast points at depths of about 200 m.

The use of tsunami data for the very first early warning is constrained by the distance to the source. At locations where arrival times are very short, it is also important to consider the actual benefit of incorporating such an evaluation for early warning. The present results reaffirm that short observation times yield less accurate results, highlighting the trade-off problem between a quick assessment and accuracy. The problem is further compounded by inherent uncertainties in inverting both earthquake and tsunami data (Cienfuegos et al. 2018). It seems reasonable to propose that, for cases such as this, higher-quality information obtained from inversion methods should be used for hazard assessment at later times in the emergency cycle, for instance to refine fast hazard assessments estimated by other means, and not necessarily as the method for first evaluation.

5 Conclusions

A methodology based on numerical simulations of near-field tsunami forecasting is presented to determine an optimal array configuration of offshore tsunami sensors. To provide an objective basis for the comparison, three parameters are considered simultaneously to assess the accuracy, thus expanding on previous methodologies that rely solely on a single parameter (e.g., arrival time). The joint use of the three estimators, viz. arrival time \(\varDelta T\), tsunami amplitude \(\varDelta H\), and model skill \(S_k\), was robust when compared against a single parameter.

The methodology was tested in Northern Chile. Results showed that a configuration comprising just three sensors is capable of providing accurate estimations of tsunami arrival time and peak amplitude of the first wave. In this way, three sensors suffice to cover a \(\sim \) 700 km stretch of coast when earthquakes in the \(M_\mathrm{w}\) 8.0 range are considered.

Results show that there is a strong dependence between the location of the sensors and the error estimators: arrival times are accurately predicted with sensors located offshore from the coastal points of interest. In addition, better results for tsunami amplitude and skill are obtained when sensors are located inside the uplifted area or directly offshore of it, near the main energy beam.

The proposed methodology shows potential for use as an operational tool in defining the location of possible tsunameters, especially for sparse configurations in countries where the financial cost of implementing and maintaining dense networks could be a hurdle.