1 Introduction

Seismicity in Germany is generally low compared to regions close to plate boundaries, such as Turkey, California, Mexico, and Japan, though the seismicity and the corresponding level of seismic hazard are not negligible (Grünthal et al. 2018). Seismic hazard is concentrated in several areas, mainly in the western and southern parts along the Rhine River, on the Swabian Alb, as well as in the Vogtland region in central Germany. The Lower Rhine Embayment is of particular interest as it is densely populated (around 3 million inhabitants) and hosts a large number of industrial facilities, while the level of seismic hazard is high, as shown by the historical seismicity. The 1756 Düren earthquake with a moment magnitude of Mw 5.9 (the strongest in recorded history to occur in Germany), the Mw 5.7 1878 Tollhausen earthquake (with casualties in Germany), and the Mw 5.3 1992 Roermond earthquake (with one casualty, 7200 damaged buildings, and 250 million Deutsche Mark in damages) are a few examples (Grünthal 2004; Grünthal et al. 2018), revealing the importance of investigating the potential value of an efficient earthquake early warning (EEW) system in this region.

EEW is mainly based on two principles that enable alerts to be sent ahead of the arrival of earthquake-induced ground shaking at target locations: first, information about the occurrence of an earthquake, transmitted by modern telecommunications technology, travels faster than seismic waves, and second, most of the energy of an earthquake is carried by S and surface waves, which reach the sites of interest after the P waves. This time difference allows for the detection of an earthquake and the issuing of warnings. In general, there are different approaches for EEW systems, as explained in detail by Allen et al. (2009): front detection, which exploits the early P waves, and on-site warning. For front detection, warnings are released based on the level of ground shaking observed at stations closer to the earthquake's epicenter. With on-site EEW, only a single station is used to detect P wave arrivals, and warnings are then issued for the same location depending on the peak ground shaking predicted from the early P waves. Depending on the available warning time, a range of actions, such as shutting down electricity or gas supplies, decreasing the speed of trains, or alerting hospitals to suspend their activities, can then be carried out (Allen et al. 2009; Nakamura et al. 2011; Stankiewicz et al. 2013). Even with only a few seconds of warning time, such actions can significantly mitigate the disasters that can follow earthquakes.

In recent years, a variety of studies have dealt with improving EEW methodologies (Hoshiba et al. 2013; Meier et al. 2019). Some EEW systems provide warnings based on real-time data, using ground motion prediction equations (GMPEs) to estimate the level of ground shaking at the users' locations (Minson et al. 2019). The Japan Meteorological Agency (JMA) procedure and the ShakeAlert system for the US west coast are examples of such real-time EEW systems (Kuyuk and Susumu 2018; Minson et al. 2019). Other approaches provide only qualitative alerts for shaking (e.g., Oth et al. 2010). Current examples of real-world applications in the utility sector are the integrated EEW/rapid response system for the natural gas network in Istanbul (Zulfikar et al. 2016) and the Bay Area Rapid Transit (BART) in the San Francisco Bay area, where, once ShakeAlert information is disseminated, the goal is to slow down or ideally stop trains before strong shaking arrives in order to avoid derailments (Wald 2020). However, in the European context, achieving warning times as long as those possible in Japan or Mexico is difficult, if not impossible, due to the tectonic setting of the region.

The seismic hazard in Europe and Germany is dominated by many small- and medium-sized seismic sources, often very close to inhabited areas. This reduces the warning times in many cases to just a few seconds, mainly allowing only automated, very rapid emergency actions to be undertaken and mostly excluding the possibility of alerting the general public. Due to the short warning times, a decentralized approach is explored as an alternative or complement to the existing regional approaches, as decentralized EEW might have greater potential in terms of promptly providing information for quick actions in such environments (Parolai et al. 2017; Prasanna et al. 2022). Herein, each recording unit of the network processes the data it records in real time and disseminates only the results through the network.

The aim of the present manuscript is to investigate the influence of the network geometry on a decentralized EEW system, with a particular focus on the Lower Rhine Embayment (LRE) in Germany, as alert times are significantly affected by the network geometry (e.g., Zollo et al. 2009; Kuyuk and Allen 2013; Nof and Allen 2016, among others). While a set of seismic stations already exists in the study area, these networks have not necessarily been established for EEW purposes, meaning that their design is not optimal for delivering the quickest and most reliable warnings once an earthquake occurs. Adding a limited number of stations at selected sites with the aim of improving the performance of the EEW system is therefore of great importance. In this regard, we adopt the network optimization approach developed by Oth et al. (2010), which was successfully applied to design EEW systems in Istanbul (Turkey) and Almaty (Kazakhstan; Stankiewicz et al. 2013). A fundamental requirement for such an optimization approach is the availability of a sufficient number of earthquake recordings that are representative of the seismic activity in the study area. However, the historical catalogs available for this region cover only a limited observation period, meaning that such databases are insufficient to fully capture the properties (potential locations and magnitudes) of relevant future earthquakes. Stochastic catalogs (i.e., realizations of the seismicity for a given time period) inferred from seismicity models developed for Probabilistic Seismic Hazard Assessment (PSHA) and representative of the mean hazard branch can help bypass these constraints. While a historical catalog can be seen as one possible, short-time-window realization of the regional seismogenic sources, stochastic catalogs issued from a PSHA model considering longer time windows (e.g., 10,000 years) are equally probable realizations of the future seismicity. Böse et al. (2022) developed a similar approach to evaluate and optimize an existing sensor network for EEW in a loss-based (fatalities and injuries) context, using 2,000 realizations of a 50-year-long stochastic earthquake catalog that samples the earthquake rate forecast of the Swiss Hazard Model in space and time. The model used in the present analysis samples the earthquake rate forecast of the German national seismic hazard model in space and time, considering both areal sources and known tectonic faults in the area as seismic sources (branch C in Grünthal et al. 2018). This branch is chosen mainly to constrain and account for major earthquake ruptures on faults as well as off-fault seismicity.

In the following, we first present the network optimization method based on this hazard-informed stochastic catalog, including the cost function calculation and the microgenetic algorithm. Following a description of the study area and the stochastic catalog, the results of the optimization are presented and discussed.

2 Method of network optimization

A critical aspect of an EEW system's performance is its ability to give not only timely but also reliable warnings for the level of expected shaking, meaning that an optimal balance has to be found such that the warning times for the target site(s) are maximized while the confidence level in the warnings is as high as possible. Herein, the phrase "warning time" refers to the system-level warning time, i.e., the time between the exceedance of a given ground-motion trigger threshold at a predefined number of recording stations and the exceedance of the threshold at the target site. We do not focus on the lead time, which is defined relative to the arrival of the S waves at the target site. This balance matters because increasing the time before an alarm is sent out increases the level of confidence but reduces the time remaining before the strong shaking arrives at the site of interest. During the optimization tests, a number of candidate sites are considered, and the warning times achievable with each group of selected stations are assessed together with the accuracy of the warnings, ignoring inherent processing and system latencies. The latter issue is discussed later on.

It is important to note that the goal of EEW is to deliver a warning after an earthquake to those areas or interested parties where the ground motion is expected to exceed certain thresholds. These thresholds can be defined by those receiving the warnings (Minson et al. 2019) and are in turn related to different warning classes. In this work, each class threshold represents a range of PGA (peak ground acceleration) values. Any event leading to a PGA of less than the first threshold at the target site is defined as class 0 (the exact threshold values are discussed later). Events generating PGA between the first and second thresholds are defined as class I, those between the second and third thresholds as class II, and, finally, events generating PGA greater than the third threshold as class III. Designing an EEW network therefore means finding the station locations that correspond to an acceptable balance between the warning time and the correct identification of the event class. For this purpose, a set of PGA thresholds (also called trigger thresholds) for the EEW network has to be identified by the optimization tool (in general different from the thresholds defined for the target). The trigger thresholds are used to associate each event with a specific warning class once the corresponding trigger threshold is exceeded at at least three recording stations, while the target site thresholds are always set to given values depending on the users' needs. A minimal sketch of this classification and triggering logic is given below.
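To make this logic concrete, the following Python sketch (illustrative only; the function and variable names are our own) assigns a warning class to a PGA value and implements the rule that a class is declared only once its trigger threshold is exceeded at at least three recording stations. The threshold values anticipate those discussed in Section 5.

```python
import numpy as np

# Illustrative trigger thresholds in g (cf. Section 5); class 0 lies
# below the first threshold.
TRIGGER_THRESHOLDS_G = [0.02, 0.05, 0.10]

def warning_class(pga_g, thresholds=TRIGGER_THRESHOLDS_G):
    """Map a PGA value (in g) to warning class 0, I (1), II (2), or III (3)."""
    return int(np.searchsorted(thresholds, pga_g, side="right"))

def triggered_class(station_pga_g, n_required=3, thresholds=TRIGGER_THRESHOLDS_G):
    """Highest class whose trigger threshold is exceeded at at least
    `n_required` recording stations; 0 means no alert is issued."""
    pga = np.asarray(station_pga_g)
    for cls in range(len(thresholds), 0, -1):  # check III, then II, then I
        if np.sum(pga >= thresholds[cls - 1]) >= n_required:
            return cls
    return 0
```

For instance, `triggered_class([0.03, 0.06, 0.12, 0.01])` returns 1 (class I), since only the first trigger threshold is exceeded at three stations.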

2.1 Cost function

To quantify the performance of each possible configuration of a set of candidate stations in the network, a cost function is defined. The better the performance of the network, the lower the cost value will be. Following Oth et al. (2010), the adopted cost function is expressed as

$$\mathrm{cost}=\sum_{i=1}^{N_{\mathrm{evt}}} W_i\left[L\,(1-K)\,\mathrm{sigmoid}\left(t_{\mathrm{warn},i},t_{\mathrm{center}},S\right)+K\right]$$
(1)

with

$$\mathrm{sigmoid}\left(t_{\mathrm{warn},i},t_{\mathrm{center}},S\right)=1-\frac{1}{1+\exp\left[-S\left(t_{\mathrm{warn},i}-t_{\mathrm{center}}\right)\right]}$$
(2)

where \(N_{\mathrm{evt}}\) is the number of events, \(W_i\) represents the weight of event i, L is equal to 0 for events with an expected class of 0 and equal to 1 for events with expected class I, II, or III, and K is 0 for events that are correctly classified and 1 for events that are not. The warning time (\({t}_{\mathrm{warn}}\)) for each class is the time between the exceedance of the corresponding trigger threshold at a predefined number of recording stations and the exceedance of that class threshold at the target site. At this stage, we ignore inherent system latencies. While all calculations here are carried out for three recording stations, this value is not fixed, although the more stations one requires, the longer the wait before an alert is declared. Requiring at least three recording stations allows the event to be sufficiently well constrained. The sigmoid function is defined to modulate the cost values depending on the amount of warning time available and is centered at \({t}_{\mathrm{center}}\) (an example is shown in Fig. 1). If the warning time is large (\({t}_{\mathrm{warn}}\gg {t}_{\mathrm{center}}\)), the sigmoid function approaches 0; if the warning time is moderate (\({t}_{\mathrm{warn}}={t}_{\mathrm{center}}\)), it equals 0.5; and if the warning time is insufficient (\({t}_{\mathrm{warn}}\ll {t}_{\mathrm{center}}\)), it approaches 1. S is the spread parameter controlling how quickly the sigmoid function transitions over time (Oth et al. 2010; Stankiewicz et al. 2013).
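The cost function of Eqs. (1) and (2) can be written compactly as follows. This is a minimal sketch: the vectorized form and the default parameter values are ours, with \({t}_{\mathrm{center}}\) = 4 s anticipating Section 5 and an illustrative spread S.

```python
import numpy as np

def sigmoid(t_warn, t_center=4.0, spread=1.0):
    """Eq. (2): close to 1 for short warning times, close to 0 for long ones."""
    return 1.0 - 1.0 / (1.0 + np.exp(-spread * (t_warn - t_center)))

def network_cost(t_warn, expected_class, predicted_class, weights=None,
                 t_center=4.0, spread=1.0):
    """Eq. (1): weighted sum over all events of the warning-time penalty
    for correctly classified events that require a warning (L=1, K=0),
    plus a full penalty (K=1) for misclassified events."""
    t_warn = np.asarray(t_warn, dtype=float)
    L = (np.asarray(expected_class) > 0).astype(float)       # 0 for class-0 events
    K = (np.asarray(expected_class)
         != np.asarray(predicted_class)).astype(float)       # 1 if misclassified
    if weights is None:
        weights = np.ones_like(t_warn)
    per_event = L * (1.0 - K) * sigmoid(t_warn, t_center, spread) + K
    return float(np.sum(weights * per_event))
```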

Fig. 1

Sigmoid functions as defined in Eq. (2) and tested in this study. The thick black line represents the sigmoid function used for class I, II, and III events; the reasons for these particular parameters are discussed in Section 5. The sigmoid functions plotted with thin blue lines correspond to the class I and II events of Oth et al. (2010) for Istanbul, Turkey. The \({t}_{\mathrm{center}}\) times are marked with small squares

The parameters of the sigmoid function are chosen according to the arrival times in our case study, as explained in Section 5. Figure 1 shows the sigmoid function used in this study together with the sigmoid functions used in the earlier work of Oth et al. (2010) for Istanbul. Although their shapes are similar, warning times in the LRE are even shorter than in the Istanbul study.

Warning times greater than the center time \({t}_{\mathrm{center}}\) of the sigmoid function are considered sufficient (lower cost value), while warning times shorter than the center time are treated as insufficient and produce a higher cost value tending toward 1 (Stankiewicz et al. 2013).

2.2 The microgenetic algorithm (MGA)

Since the number of candidate sites is much larger than the number of stations to be deployed, a large number of possible station combinations exists for the desired network. In order to find the most promising solutions to this problem, a microgenetic algorithm is used (Coello and Pulido 2001). Microgenetic algorithms are a sub-branch of genetic algorithms (Goldberg and Holland 1988). In brief, a genetic algorithm (GA) is a guided search technique consisting of an iterative optimization through the selection of the best test models, called chromosomes (i.e., those with the lowest cost), their genetic recombination, and their mutations. This procedure is based on evolutionary principles. The algorithm starts with a random population (a random arrangement of n stations out of the candidate stations), as shown in Fig. 2. Then, the cost value is calculated for each chromosome (a combination of station locations). The fittest chromosome is kept unchanged throughout several iterations of the microgenetic algorithm while the rest of the population is reinitialized randomly, which helps to increase the diversity of the tested populations.

Fig. 2

Simple illustration of the microgenetic algorithm: each station configuration represents a chromosome made of n stations (8 in this illustration). Each letter represents a station location. For each chromosome, the cost function is calculated according to Eq. (1). After several iterations, the fittest chromosome, i.e., the one with the lowest cost value, is kept

The MGA parameters used in this study are similar to those applied in Oth et al. (2010). Since we are interested in deploying nine stations in addition to the five stations already existing in the area, the number of stations per chromosome was set to nine, and the population size was set to 14 chromosomes to ensure sufficient genetic diversity while limiting the computational cost. No mutation was used, and the crossover probability was set to 0.95. The algorithm was run for 50 generations in each case, and 600 best solutions (one per run) were collected. A minimal sketch of such a micro-GA is shown below.
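The following sketch outlines the micro-GA loop under the parameter settings given above (population of 14, crossover probability 0.95, no mutation, 50 generations). The recombination and restart details are illustrative rather than a reproduction of Coello and Pulido (2001); `cost_fn` would wrap the cost function of Eq. (1) evaluated over the full event set for a given station configuration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def micro_ga(candidates, n_select=9, pop_size=14, n_generations=50,
             crossover_prob=0.95, cost_fn=None):
    """Minimal micro-GA sketch with elitism and random restarts."""
    def random_chromosome():
        return rng.choice(candidates, size=n_select, replace=False)

    def crossover(a, b):
        # Uniform recombination of two parents; duplicate stations are
        # repaired by refilling with unused candidates.
        mask = rng.random(n_select) < 0.5
        child = np.where(mask, a, b)
        unique = list(dict.fromkeys(child))
        pool = [c for c in candidates if c not in unique]
        while len(unique) < n_select:
            unique.append(pool.pop(rng.integers(len(pool))))
        return np.array(unique)

    population = [random_chromosome() for _ in range(pop_size)]
    best, best_cost = None, np.inf
    for _ in range(n_generations):
        costs = [cost_fn(ch) for ch in population]
        i_best = int(np.argmin(costs))
        if costs[i_best] < best_cost:
            best, best_cost = population[i_best], costs[i_best]
        # Keep the elite chromosome and reinitialize the rest of the
        # population to maintain genetic diversity (no mutation used).
        population = [best] + [
            crossover(best, random_chromosome())
            if rng.random() < crossover_prob else random_chromosome()
            for _ in range(pop_size - 1)
        ]
    return best, best_cost
```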

At the end of each optimization run, the fittest chromosome of the microgenetic algorithm, representing the best station configuration, is taken as the best solution of that micro-GA run. As the optimization problem is typically non-unique and there is generally no single "best solution," we select the final solution from the different micro-GA runs by scoring the various solutions based on how frequently each station appears among all the solutions. If some stations appear with more or less the same frequency, we use the spatial coverage of the network with respect to the prevailing seismicity distribution as an additional decision criterion.

3 Study area

The Lower Rhine Embayment (LRE) is part of the European Cenozoic Rift System, which evolved mainly during the Neogene because of the lithospheric response to the Alpine Orogeny and the opening of the Atlantic (Ziegler 1992; Michon et al. 2003). The LRE is characterized by slow NE-SW extension and a set of normal faults (such as the Erft fault) that separate a number of NW-elongated tectonic blocks in a horst and graben style (e.g., Geluk et al. 1995; Houtgast and van Balen 2000; Grützner et al. 2016).

Camelbeeck et al. (2007) showed that earthquakes with magnitudes equal to or greater than 6.0 have occurred since the fourteenth century in the region extending from the LRE to the southern North Sea. Furthermore, Camelbeeck et al. (2020) estimated the annual seismic moment rate in the Lower Rhine Graben to be between \(9.0\times {10}^{15}\) and \(1.7\times {10}^{16}\) N·m/year and the horizontal extension rates along the faults to range from 0.07 to 0.13 mm/year.

In this area, two seismic networks (a weak-motion and a strong-motion network) operated by the North Rhine-Westphalia Geological Service (GD NRW) for the purpose of monitoring local seismicity are currently in place. Additional stations are to be added such that, in the event of an earthquake, all stations together will act as a regional EEW system delivering the most reliable and fastest warnings for our target site (pink square in Fig. 3b), a chemical industrial plant located in the city of Hürth (the chemical park Knapsack), about 10 km southwest of the city of Cologne. In this way, the final network will be a user-oriented hybrid network for an EEW system in this area, combining the existing system of the GD NRW with our decentralized approach.

Fig. 3

a The Lower Rhine Embayment, Germany and the study area (black box). b Existing stations are marked by blue triangles, the target site (the chemical park Knapsack near the city of Cologne) is marked with a pink square, and the cataloged events in the area (Grünthal and Wahlström 2012) are shown by gray circles. The fault lines in red are from Fault2SHA (Faure Walker et al. 2021). The Mw 5.8 1756 Düren earthquake is marked by the red star, and the Mw 5.4 1992 Roermond earthquake is marked by the yellow star

4 Modeling potential earthquake locations and magnitudes

Considering all possible seismic sources in the network optimization approach requires a representative sample of earthquakes of potential relevance for EEW (including an appropriate set of smaller events for which no warning should be given). In this work, we rely on seismicity models developed in the framework of PSHA for defining an appropriate earthquake catalog. These models aim at capturing the seismogenic properties (e.g., earthquake rates, maximum magnitudes) of a given region. Stochastic catalogs (realizations of the seismicity for a given time period) inferred from these models can be considered equally probable, long-time-window realizations of the future seismicity.

The stochastic catalog was generated using the event-set calculator of the OpenQuake engine (Pagani et al. 2014). From the earthquake rupture forecast, i.e., the complete inventory of ruptures the source model can produce (given the configuration settings), including their 3D geometries, a 10,000-year-long stochastic catalog was derived. A time period of 10,000 years can generally be considered long enough to capture all potential locations and magnitudes of future earthquakes. Similar synthetic catalogs were also used by Razafindrakoto et al. (2021, 2022). For area sources, each rupture is generated as a virtual fault, i.e., a simple planar fault whose area, aspect ratio, rake, and orientation (strike and dip) are controlled by the magnitude of the earthquake and the tectonic properties of the area source. Fault sources, on the other hand, can change direction along strike but have a constant dip.

For each generated rupture, the number of occurrences within the investigation time (10,000 years) is sampled assuming a Poisson model. If the number of occurrences of a rupture is greater than zero, the rupture is retained and becomes an event in the stochastic event set. For every retained rupture, the 3D geometry coming from the source model configuration and all relevant properties (e.g., magnitude, dip, rake, hypocenter) are stored. For the area sources, a 5-km mesh spacing is used.
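A minimal sketch of this event-set sampling logic is given below, assuming a simplified rupture representation with an `annual_rate` attribute (the actual implementation is the event-set calculator of the OpenQuake engine, which uses its own rupture objects).

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def stochastic_event_set(ruptures, investigation_time=10_000.0):
    """For each rupture of the earthquake rupture forecast, sample its
    number of occurrences from a Poisson distribution and keep it if it
    occurs at least once.  `ruptures` is assumed to be a list of dicts
    with an 'annual_rate' key plus geometry/magnitude attributes
    (hypothetical structure for illustration)."""
    events = []
    for rup in ruptures:
        n_occ = rng.poisson(rup["annual_rate"] * investigation_time)
        if n_occ > 0:
            events.extend([rup] * n_occ)  # retains magnitude, dip, rake, ...
    return events
```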

The ground motion database is built using the finite-fault stochastic method of Motazedian and Atkinson (2005) and Atkinson and Assatourians (2015) (EXSIM), tailored for EEW purposes by including simulations of the P waves. In the EXSIM method, the fault is subdivided into point sources. The Fourier spectrum of the expected ground motion is computed for each point source, and the ground motion time series is obtained by generating Gaussian noise, modulating it to match the target amplitude spectrum, and superimposing the contributions of all point sources. In its original form, the EXSIM method only generates S wave ground motion simulations, while for EEW purposes the compressional waves need to be included. Following Böse (2006), we simulate the P and S waves separately using appropriate source spectrum terms, inelastic attenuation, radiation patterns, and wave speeds, and then combine them with their respective arrival times. A similar procedure was used by Oth et al. (2010) for Istanbul, Turkey, and by Stankiewicz et al. (2013) for Almaty, Kazakhstan. The sketch below illustrates this combination step.
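The following strongly simplified sketch shows the core of the procedure: Gaussian noise is shaped to a target Fourier amplitude spectrum, and separately simulated P- and S-wave contributions are inserted at their respective arrival times. The spectral shaping omits the windowing and normalization details of EXSIM, and the target spectra (which would carry the source, attenuation, and radiation-pattern terms) are left as user-supplied functions.

```python
import numpy as np

def shaped_noise(target_spectrum, dt, duration, rng):
    """Noise time series whose Fourier amplitude matches a target spectrum
    (the core step of the stochastic method); details are simplified."""
    n = int(duration / dt)
    noise = rng.standard_normal(n) * np.hanning(n)  # windowed Gaussian noise
    spec = np.fft.rfft(noise)
    spec *= target_spectrum(np.fft.rfftfreq(n, dt)) / (np.abs(spec) + 1e-12)
    return np.fft.irfft(spec, n)

def p_plus_s_trace(spec_p, spec_s, t_p, t_s, dt=0.01, duration=30.0, seed=2):
    """Combine separately simulated P- and S-wave contributions at their
    respective arrival times, in the spirit of Böse (2006).  spec_p and
    spec_s are callables of frequency returning the target spectra."""
    rng = np.random.default_rng(seed)
    n = int(duration / dt)
    trace = np.zeros(n)
    for spec, t_arr in ((spec_p, t_p), (spec_s, t_s)):
        wave = shaped_noise(spec, dt, duration / 2, rng)
        i0 = int(t_arr / dt)
        trace[i0:i0 + wave.size] += wave[:max(0, n - i0)]
    return trace

# Example with flat spectra, P arriving at 2 s and S at 4 s:
# tr = p_plus_s_trace(lambda f: np.ones_like(f),
#                     lambda f: 2 * np.ones_like(f), 2.0, 4.0)
```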

Following this procedure, ground motions were simulated for 284 scenario earthquakes (in the magnitude range of Mw 4.5-7.3) at 76 preexisting and potential station sites in the LRE area (see Fig. 4). The 1992 Roermond and 1756 Düren earthquakes are among these scenario earthquakes. For the scenario events, we only select ruptures that are likely to generate a PGA > 0.02 g. Following Stankiewicz et al. (2013), the crucial issue for the optimization approach is that all potential earthquake sources are considered. For the potential stations, the area of investigation was preselected first. Then, the cities and villages within the area where stations could be deployed were selected such that the candidate stations would have, to the extent possible, a uniform distribution pattern. This procedure leads to a collection of 76 candidate sites. The adopted earthquake source characteristics (magnitude, source mechanism) in the simulations are taken from the stochastic catalog, and the corresponding fault dimensions are defined based on the scaling relationship of Leonard (2014). The regional kappa value (Anderson and Hough 1984), which describes the spectral decay of the acceleration spectrum at high frequencies, is set to 0.01 s following Pilz et al. (2019). Site amplification factors have been calculated from the velocity model of Pilz et al. (2021). As this model does not cover the whole study area, we additionally calculated amplification factors from the slope-derived proxy vs30 values of Wald and Allen (2007) on a 30-arcsec uniform grid. The remaining key EXSIM parameters (e.g., stress drop, attenuation) are defined based on the spectral decomposition results of Bindi and Kotha (2020).
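As an illustration of the fault-dimension step, the sketch below derives rupture length and width from magnitude using the generic approximation \(A \approx {10}^{(\mathrm{Mw}-4)}\) km² together with an assumed aspect ratio; this form is close to, but not identical with, the Leonard (2014) relation actually used in the study.

```python
import numpy as np

def fault_dimensions(mw, aspect_ratio=2.0):
    """Rough rupture dimensions from moment magnitude, using the generic
    approximation A ~ 10**(Mw - 4) km² (illustrative, not the exact
    Leonard 2014 coefficients) and an assumed aspect ratio."""
    area = 10.0 ** (mw - 4.0)              # rupture area in km²
    width = np.sqrt(area / aspect_ratio)   # down-dip width in km
    length = aspect_ratio * width          # along-strike length in km
    return length, width

print(fault_dimensions(5.9))  # Düren-like scenario: roughly (12.6, 6.3) km
```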

Fig. 4

a Candidate sites and the stochastic catalog. The white triangles show the locations of the candidate stations. Existing stations are marked by blue triangles. The target site is marked by the pink square. Purple lines mark the faults in the area. Simulated scenario events are indicated by black circles. b Bar plot of the Mw-depth distribution of the simulated events from the stochastic catalog

As a first step in assessing the simulations, Fig. 5 compares the simulated and empirical ground motions as a function of source-to-site distance. The empirical prediction model considered in this figure is that of Bindi et al. (2017), which was calibrated for hazard applications in moderate- and low-seismicity areas. The median and the standard deviation of Bindi et al. (2017) (for vs30 = 350 m/s, i.e., the average S wave velocity in the uppermost 30 m) are displayed as thick and thin lines, respectively. The simulated ground motions follow the trend of the empirical model with no significant under- or overprediction.

Fig. 5

Comparison of the overall distance attenuation of peak ground acceleration from simulations (blue dots) and empirical ground motion models of Bindi et al. (2017) for events with magnitudes of (left) Mw 5.0 and (right) Mw 6.5

5 Data insights and maximum warning times

The distribution of P and S wave arrival times at the target site for the stochastic catalog is shown in Fig. 6. The short time intervals between the occurrence of the earthquakes and the arrival of their P and S phases at the target site illustrate the challenge for EEW in this area, where the distances between the target site and the earthquake sources are rather small (on the order of a few to a few tens of kilometers). The short sketch below illustrates the corresponding travel times.
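As a rough orientation, assuming typical crustal velocities (illustrative values, not those used in the simulations), the P and S travel times for the relevant distance range can be estimated as follows.

```python
import numpy as np

vp, vs = 6.0, 3.5                        # assumed crustal velocities in km/s
hypo_dist = np.array([10.0, 30.0, 60.0]) # hypocentral distances in km
t_p, t_s = hypo_dist / vp, hypo_dist / vs
print(t_p)  # ~[1.7, 5.0, 10.0] s: P arrivals leave very little time
print(t_s)  # ~[2.9, 8.6, 17.1] s: strong S shaking follows shortly after
```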

Fig. 6

Distribution of the travel times of a P waves and b S waves for all simulated events. The red lines show the kernel density estimate of the arrival times, added as a simplified representation of the distribution

In this work, we define warning class I as events with PGA values between 0.02 and 0.05 g, class II as events with PGA between 0.05 and 0.1 g, and class III as events with PGA greater than 0.1 g. These threshold levels are identical for all stations. Since we are interested in earthquakes that may cause damage at the target site, the same PGA thresholds are chosen for the target site. The minimum PGA threshold is set to 0.02 g, similar to previous studies (e.g., Oth et al. 2010; Minson et al. 2019), as this value has been found to be large enough to avoid false alarms caused by small earthquakes and anthropogenic activity, but small enough to provide reliable and sufficiently fast warnings for moderate to large events with strong shaking. As discussed by Oth et al. (2010) for the case of Istanbul, increasing the threshold levels would lead to a significant reduction in the warning times, in particular for class III events, while decreasing them would negatively affect the alert accuracy (Minson et al. 2019).

Looking at the time distribution between the exceedance of the trigger thresholds at the candidate stations and the exceedance of the thresholds at the target site, as shown in Fig. 7, helps in understanding the potential and corresponding limitations of EEW in the area. In this figure, the time differences between the exceedance of class-definition thresholds 1, 2, and 3 at the target site and the exceedance of the trigger thresholds at three candidate stations are plotted for all simulated events. Considering the number of candidate stations (76) and the number of events (284), the potential time to issue a warning does not reach 4 s for many events (i.e., bars to the right of -4 s in Fig. 7). Threshold 3 events have overall larger potential warning times. For threshold 1 warnings at the target site, most of the data are concentrated below 4 s, i.e., most stations cannot contribute much to an EEW system requiring at least 4 s for issuing a warning. Due to the short travel times and the correspondingly short maximum possible warning times (Fig. 7c), a small value of \({t}_{\mathrm{center}}\) is necessary. We therefore set the center time of the sigmoid function (as shown in Fig. 1) to 4 s, so that events with warning times above 4 s are not penalized.
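The potential warning time per event and class, as used in Fig. 7, can be sketched as follows (system latencies ignored; the three-station rule follows the text, while the implementation details are ours).

```python
import numpy as np

def potential_warning_time(t_trigger_stations, t_exceed_target):
    """Warning time for one event and one class: time between the third
    earliest trigger-threshold exceedance among the stations and the
    exceedance of the class-definition threshold at the target site.
    Stations at which the threshold is never exceeded carry t = np.inf."""
    t_sorted = np.sort(np.asarray(t_trigger_stations, dtype=float))
    if t_sorted.size < 3 or not np.isfinite(t_sorted[2]):
        return None                        # fewer than three triggers: no alert
    return t_exceed_target - t_sorted[2]   # positive values are usable time
```

For example, `potential_warning_time([2.1, 2.4, 2.9, np.inf], 6.5)` yields 3.6 s.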

Fig. 7

The distribution of the largest theoretically possible warning time between the exceedance of the PGA trigger threshold at the first three candidate stations and at the target site ignoring system latencies for three classes at the target site: a class I, b class II, and c class III. These presented times for the various classes at the target site are independent of the class observed at the candidate sites. Negative time values indicate the time at which the respective level of ground motion is exceeded at the candidate site before the ground shaking reaches the target site. The dashed lines show the potential warning times larger than 4 s

Since some candidate stations deliver large potential system warning times more often than others, we look for the stations with the largest number of potential system warning times greater than 4 s over all simulated earthquakes. Figure 8a shows the locations of the nine potential stations (the number of additional stations we are interested in establishing) with the largest number of warning times above 4 s. Note that while these stations provide warning times greater than 4 s for most events, this does not hold for all simulated earthquakes.

Fig. 8

Analysis of the candidate stations with the highest number of large warning times as shown in Fig. 7. a Stations with the largest number of system warning times larger than 4 s are shown with the black triangles, and the remaining sites are marked in gray. The epicenter of the sample event is marked by a pink star, and the target site (target) is marked by a pink square. Blue circles show the location of the existing stations. b Simulated waveforms at the candidate sites (in gray or black as in a) for the scenario event. The target site seismogram is colored in pink, and seismograms of the existing stations are colored in blue

An example of the waveforms recorded by all stations is plotted in Fig. 8b for a sample event (star in Fig. 8a). The modeled event represents a rather particular case (i.e., an earthquake occurring far from the network and the target site) in which the set of stations can provide a considerable warning time for the target site, even with system latencies ignored. However, this is not the case for several other earthquakes in the area due to the scattered distribution of the seismicity (see Fig. 4), which makes this simple approach, i.e., selecting the EEW stations by merely inspecting the system warning time histograms, suboptimal.

6 Optimization results and discussion

An overview of the results of the optimization procedure is shown in Fig. 9. The more often a given station is recommended as part of the best network configurations in the different runs, the bigger its triangle in this map. While some candidate stations are identified as part of the best network configuration in many runs (large triangles in Fig. 9), for stations that were selected less frequently but about equally often as others, the spatial coverage and proximity to the prevailing seismicity distribution were used as additional decision criteria. From this set of most frequently recommended potential stations, and considering the stations already existing in the area, a group of stations has been selected as the final EEW network, as shown in Fig. 10.

Fig. 9

Stations appearing most frequently in the 600 runs for a ten-station network with the lowest cost values in the set of best network configurations, considering the existing stations in the area. Candidate stations are marked by blue triangles; their size is proportional to how frequently each station appears in the best solutions across all runs. The target site is represented by the purple square. The red circle around the site in the center indicates the station most often recommended by the algorithm (110 of 600 runs, i.e., 18% of all runs)

Fig. 10

The station configuration of the best seismic network for EEW purposes. Starting from 76 candidate stations (white triangles) and the existing network (blue triangles), the proposed locations of the stations to be added to the final seismological network are shown as red triangles. Black circles represent the simulated events, and the target site is represented by a purple square

One key question is, of course, how much these additional stations can contribute to improving the existing system. To answer this question, the case of using only the already deployed stations as an EEW system is compared with the outcome of using both the existing and the additional stations. The resulting warning times and event classifications for the target site are used to quantify the performance of the EEW system, with the warning time distribution of the new EEW system, using the combination of current and proposed stations, shown in Fig. 11. The mean and median warning times for the target site, ignoring processing and transmission latencies and using only the existing stations, are 4.1 and 3.4 s, respectively, while the mean and median warning times of the EEW system with the additional stations increase to 5.2 and 4.5 s.

Fig. 11

a Correctly (blue) and incorrectly (orange) classified events for three warning classes (0.02, 0.05, and 0.1 g) using the existing stations only. b Corresponding warning times using only the existing stations. The blue line shows the kernel density distribution of the warning times for a simplified representation of the distribution. The dashed red and the green lines represent the mean and the median values. c As in (a) but using the combination of the existing and proposed stations. d As in (b) but for the enlarged network

In total, the number of correctly classified events improves significantly once we add the stations at the selected sites to the existing network (more events correctly classified in the same class; blue bars in Fig. 11a and c). While an overall improvement is clearly visible for classes I, II, and III, for class 0 (events with ground shaking of less than 0.02 g at the target site) the classification accuracy decreases (i.e., there are a few more false alarms) when adding stations to the existing network. To take a closer look at this group of events, the locations of the events causing class 0 shaking at the target site are plotted in Fig. 12.

Fig. 12

Events for which the ground shaking level at the target site is less than 0.02 g (class 0) represented by white stars. Black dots show the location of the rest of the simulated events. The existing stations are marked with blue triangles, and the additional stations are marked with red triangles. The pink square marks the target site

As can be seen in Fig. 12, the class 0 events are mainly concentrated around the additional stations, while they are located at overall larger distances from the existing stations. For such events, the target site records the same low shaking level (less than 0.02 g), even though nearby stations may exceed the trigger thresholds. In other words, the farther the target site is from the selected candidate stations, the higher the possibility of false alarms and of overestimating the shaking at the target site. On the other hand, the additional stations increase the classification accuracy for the target site for earthquakes of classes I, II, and III, and they also increase the warning time, as seen when comparing Fig. 11b and d. While the increase in false alarms is obviously not intended, it demonstrates the challenge of striking a balance between accurate and timely warnings. Moreover, while the performance of the updated network can mathematically be quantified by a single cost value, it is important to analyze the network performance carefully, as information might be lost in the averaging process defined in Eq. (1).

While the focus so far has been on moderate levels of shaking, we exemplarily test the performance of the EEW system for a number of spatially well-distributed strong-motion events (Fig. 13). Under the assumption that at least three exceedances are needed to issue an alarm (i.e., at least three green diamonds, three yellow diamonds, or three red stars), the additional stations are clearly beneficial, as earlier warnings, represented by the difference between the third exceedance of the trigger threshold and the exceedance of the class-definition threshold at the target site, can be released. The larger the time difference between the third exceedance of a threshold (third diamond) and the exceedance of the threshold at the target site (dashed line of the same color), the more the additional stations have contributed to the EEW system. Using the additional stations in the EEW system increases the warning time by up to 9.8 s for these sampled events (threshold 3 for the event shown in Fig. 13b). Such a significant increase is, however, not expected for earthquakes occurring very close to the target site, such as the earthquake scenario shown in Fig. 13d. For such events, the additional stations bring no benefit, as the warning times would be too short.

Fig. 13

Increase in EEW warning times due to adding stations at sites from the optimization approach. Simulated waveforms for an event close to the epicenter of the 1992 Roermond event (a and b), for an event similar to the 1756 Düren event (c), and for a scenario event studied by Pilz et al. (2019) (d). Each plot shows the waveforms recorded by the existing (gray lines) and by the additional stations (black lines). The waveforms recorded at the target site are plotted in pink. For each waveform, the time of exceedance of each trigger threshold is marked by green diamonds (for threshold 1), yellow diamonds (for threshold 2), and red stars (for threshold 3). For the target site, the exceedance of target-site thresholds 1, 2, and 3 is marked by vertical green, yellow, and red dashed lines. e Map of the network and the location of the additional (red triangles) and existing (blue triangles) stations. Analyzed and discussed events are marked by a star

The warning times discussed so far, however, do not include data latencies and processing delays. These comprise (1) the trigger delay, i.e., the time between the arrival of the P wave at the sensor and the exceedance of the trigger level, (2) the transmission delay, i.e., the time needed to transmit the information between the sensors, and (3) latencies associated with alert dissemination. While the first depends on the complexity of the underlying algorithm, the second depends only on telecommunication factors. With algorithmic improvements, data latencies of less than 1 s can be achieved nowadays (Clinton et al. 2016), while transmission delays account for 0.1 to 0.5 s depending on the communication protocol. The decentralized structure has a positive impact on the third point: as event detection is carried out on each sensor and each sensor communicates only with its neighboring sensors, the system latency is reduced (as outlined by Prasanna et al. 2022). In contrast, the system latency of centralized processing increases with the number of sensors in the network, since a centralized network has a single processing unit for the entire network, whereas in the proposed decentralized processing each sensor runs the algorithm for its neighboring sensors. In total, therefore, latencies on the order of, or slightly larger than, 1 s have to be added to the presented warning times for the decentralized approach.
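As a simple illustration of this latency budget (with indicative values taken from the figures quoted above, not measurements), the latencies reduce the effective warning time roughly as follows.

```python
# Illustrative latency budget for the decentralized setup (indicative values):
trigger_delay = 1.0        # s, detection/processing on the sensor (< 1 s feasible)
transmission_delay = 0.3   # s, sensor-to-sensor communication (0.1-0.5 s range)
raw_warning_time = 4.5     # s, e.g., the median for the enlarged network

effective_warning_time = raw_warning_time - (trigger_delay + transmission_delay)
print(f"effective warning time ≈ {effective_warning_time:.1f} s")  # ≈ 3.2 s
```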

Finally, the benefit of the additional stations for an EEW system is also tested for several strategically important cities in the area, as shown in Fig. 14. As can be seen, the designed network is optimal for the target site in the sense of having the highest number of events detected with the correct class. Accordingly, for the target site, the improvement in the number of correctly classified events and the increase in the warning times after adding the stations to the EEW system are significant, and classes I, II, and III are predicted with higher precision. The network arrangement is not optimal for the tested cities in terms of predicting the ground-shaking class without further processing and additional factors in the EEW analysis. However, we observe a rather significant increase in the resulting warning times. In other words, the additional stations will be helpful not only for the target site but also for potential applications in the major cities of the region (Cologne, Bonn, Aachen, and Wermelskirchen, marked in Fig. 13), subject to some extra processing. Since the EEW system is not designed explicitly for each of these cities, further factors must be considered before issuing a warning in order to estimate the correct class. For example, for Aachen, the classification of class III events improves and the average warning time increases. For Bonn, there is an improvement for class II events, although in general there is a strong mismatch of classes, particularly for class III, while for Wermelskirchen the warning times increase. The class misidentification in these cases occurs because the network's stations are mostly located farther away from these locations. As already stated above, lead and warning times are rather short and will therefore most likely only allow automatic actions to be taken.

Fig. 14

Comparison of the performance of the EEW system before and after adding the stations for additional strategically important cities shown in Fig. 13: a Cologne, b Aachen, c Bonn, and d Wermelskirchen. Plots (I), (II), and (III) represent the performance of the EEW system using only the existing stations, while (IV), (V), and (VI) indicate the performance using the additional stations besides the existing ones. Histograms in plots (I) and (IV) represent the proportion of events in each class. Plots (II) and (V) show the proportion of events that were identified with the same class as observed at the target site. Plots (III) and (VI) indicate the change in the corresponding warning times

Obviously, the designed network has been optimized for the specific target site only. However, if this EEW system were to be used for the cities discussed above, additional factors could be included to account for the specific situation, i.e., the difference in classes between target cities like Cologne, Bonn, or Aachen and the candidate stations. For example, the number of events correctly classified as class 0, using only the existing stations, is higher for Cologne. When the additional stations are included, this number drops, since these stations are closer to the epicenters of the earthquakes and no longer classify the shaking as class 0. In other words, the farther a city is from the selected candidate stations in the center, the higher the possibility of false alarms and of overestimating the shaking. While such results obviously depend on the choice of the class-definition thresholds assigning each event to class 0, I, II, or III (i.e., on the choice of what level of ground motion is considered severe), they demonstrate the challenge of striking a balance between accurate and timely warnings.

However, eliminating these stations would not be beneficial, as it would be accompanied by a loss of the warning time that could otherwise be gained. Conversely, simply using the additional stations to issue an alarm upon the exceedance of the trigger thresholds would lead to false alarms. Stankiewicz et al. (2015) suggested using the spectral content of the first few seconds of the P wave to provide better event classification than a threshold system alone. We have, however, not adopted this approach, as improvements were found mainly for events at distances between 60 and 100 km, which is farther than most of our studied events. It remains to be investigated whether the use of existing ground motion models could help in providing faster and/or better constrained warnings. Although widely used, source-based methods have recently been suggested by theoretical work to be inherently limited in terms of the accuracy and timeliness of their predictions (Minson et al. 2019; Hoshiba 2020). As an alternative, wavefield-based methods (an overview has recently been given by Hoshiba 2021) skip the quantification of source parameters and instead predict future ground motions directly from observed ground motions. Because rather dense observation networks are necessary, such approaches have so far only been applied in a few cases in Japan, but they might become feasible in the LRE once the additional stations have been set up. Moreover, earthquake early warning has also recently gained interest from end-to-end machine learning approaches developed to predict shaking at locations that do not host seismic stations (Jozinović et al. 2020; Münchmeyer et al. 2021).

7 Conclusions

In this study, we developed a target-oriented network layout for an EEW system in the Lower Rhine Embayment in western Germany. The synthetic event catalog used in the optimization approach is based on a PSHA seismicity model and is therefore fully hazard-compatible. A basic requirement of the presented approach is that the database of synthetic seismograms can be considered appropriate for the respective EEW calculation. A microgenetic algorithm is then used to identify additional network sites complementing the existing sparse network. Although some stations already exist in the region and can be used to aid the network, there are nonetheless critical locations at which further stations are necessary. The accuracy of the warnings at the target site is checked for three different trigger threshold levels, although the exact threshold values can be tailored to the needs of the (end-)users. Due to the scattered and spatially less constrained seismicity, the warning times in the Lower Rhine Embayment are overall shorter than in previous studies, which makes EEW more challenging. With the optimally densified network, ground motions can be classified with higher accuracy for the different classes of events, and the warning times increase for the target site as well as for several tested neighboring cities.

Even though we have presented instances of highly effective networks, we emphasize that these are not the only solutions we computed, meaning that further restrictions can be placed on the ideal network. This can be accomplished either by removing networks from the list of solutions if they do not fit the new requirements or by adding a penalty to the cost function and rerunning the microgenetic algorithm. Although the proposed network clearly improves the possibility of issuing an appropriate alarm for the target site with a longer warning time, we observe that the balance between a network's speed and its reliability is extremely delicate, and any potential network's performance should not just be summarized by a single value but must be thoroughly assessed.