# CyberShake: A Physics-Based Seismic Hazard Model for Southern California

## Authors

Graves, R., Jordan, T.H., Callaghan, S., et al.

DOI: 10.1007/s00024-010-0161-6

Cite this article as: Graves, R., Jordan, T.H., Callaghan, S. et al. Pure Appl. Geophys. (2011) 168: 367. doi:10.1007/s00024-010-0161-6

## Abstract

CyberShake, as part of the Southern California Earthquake Center’s (SCEC) Community Modeling Environment, is developing a methodology that explicitly incorporates deterministic source and wave propagation effects within seismic hazard calculations through the use of physics-based 3D ground motion simulations. To calculate a waveform-based seismic hazard estimate for a site of interest, we begin with Uniform California Earthquake Rupture Forecast, Version 2.0 (UCERF2.0) and identify all ruptures within 200 km of the site of interest. We convert the UCERF2.0 rupture definition into multiple rupture variations with differing hypocenter locations and slip distributions, resulting in about 415,000 rupture variations per site. Strain Green Tensors are calculated for the site of interest using the SCEC Community Velocity Model, Version 4 (CVM4), and then, using reciprocity, we calculate synthetic seismograms for each rupture variation. Peak intensity measures are then extracted from these synthetics and combined with the original rupture probabilities to produce probabilistic seismic hazard curves for the site. Being explicitly site-based, CyberShake directly samples the ground motion variability at that site over many earthquake cycles (i.e., rupture scenarios) and alleviates the need for the ergodic assumption that is implicitly included in traditional empirically based calculations. Thus far, we have simulated ruptures at over 200 sites in the Los Angeles region for ground shaking periods of 2 s and longer, providing the basis for the first generation CyberShake hazard maps. Our results indicate that the combination of rupture directivity and basin response effects can lead to an increase in the hazard level for some sites, relative to that given by a conventional Ground Motion Prediction Equation (GMPE). 
Additionally, and perhaps more importantly, we find that the physics-based hazard results are much more sensitive to the assumed magnitude-area relations and magnitude uncertainty estimates used in the definition of the ruptures than is found in the traditional GMPE approach. This reinforces the need for continued development of a better understanding of earthquake source characterization and the constitutive relations that govern the earthquake rupture process.

### Keywords

Physics-based earthquake simulation · seismic hazard · rupture directivity · 3D basin response

## 1 Introduction

Numerical simulations of the strong ground motions caused by earthquakes have improved to the point where it is worth investigating the predictive power of these physics-based methods in seismic hazard analysis. Southern California provides a suitable natural laboratory. Researchers working together through the Southern California Earthquake Center (SCEC) have developed community velocity models, CVM-S (http://www.data.scec.org/3Dvelocity) and more recently CVM-H (http://sger5.harvard.edu/cvm-h), that include detailed representations of sedimentary basins and other near-surface structures, which influence ground motions. Numerical simulations of anelastic wave propagation through these three-dimensional (3D) structures have been tested against the ground motions recorded by the California Integrated Seismic Network (CISN; e.g., Graves, 2008; Mayhew and Olsen, 2010), and efforts are underway to improve the CVMs using the earthquake waveform data (Chen *et al.*, 2007; Tape *et al.*, 2010). The plate-boundary fault system has been well described in a Community Fault Model (CFM; Plesch *et al.*, 2007), and long-term earthquake rupture forecasts based on the CFM are now available (Field *et al.*, 2009).

These developments have motivated considerable research on the prediction of strong ground motions from the large, as-yet unobserved, fault ruptures that will someday occur. Source directivity and basin excitation effects have been studied systematically (e.g., Olsen *et al.*, 2006; Graves *et al.*, 2008; Aagaard *et al.*, 2010), dynamic rupture models have been used to calibrate kinematic rupture models (Guatteri *et al.*, 2004; Olsen *et al.*, 2008, 2009; Song *et al.*, 2009; Schmedes *et al.*, 2010), and different simulation codes have been cross-verified (Bielak *et al.*, 2010). One goal of this research is to improve the ground motion prediction equations (GMPEs) commonly used in seismic hazard analysis.

Ground motion prediction equations specify the conditional probability of exceeding a ground motion intensity measure at a particular geographic site for a particular source represented in an earthquake rupture forecast (Cornell, 1968). The intensity measure commonly used is SA(*T*), the 5% damped spectral acceleration at a period *T*. In probabilistic seismic hazard analysis (PSHA) terminology, a “source” is the spatial locus of a rupture, usually a delineated area of a fault surface. By combining the GMPE probabilities with the rupture probabilities of all considered sources, one can compute the unconditional probability of exceeding the intensity measure during a future time interval. A plot of this probability of exceedance, PoE, as a function of SA is called a hazard curve for the specified site. In general, the hazard curve will depend on the site location, as well as the ground shaking period and the time interval being considered. However, most GMPEs are attenuation relationships in which the site location is parameterized by a relative location with respect to the source (which depends on the fault geometry), the site conditions (often represented by *V*_{s30}, the local average of the shear velocity in the upper 30 m) and, in some cases, the local depth of the sedimentary basin (e.g., Field, 2000; Abrahamson and Silva, 2008; Campbell and Bozorgnia, 2008; Chiou and Youngs, 2008). These parameterizations are determined by empirical regressions of assumed functional forms to the available data. Additionally, the application of GMPEs in the probabilistic framework typically assumes that the measured variability of ground motions (encompassing multiple earthquakes observed at spatially distributed sites over the last few decades) accurately represents the variability of motions expected at a single site over many earthquake cycles (potentially thousands of years). This is the so-called ergodic assumption, and it has been suggested that it can introduce an unintended upward bias in the estimated hazard level (e.g., Anderson and Brune, 1999; O’Connell *et al.*, 2007).
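The probabilistic combination described above can be sketched in a few lines of code. This is an illustrative toy, not the production implementation: the two sources, their annual rates, and the lognormal GMPE parameters are all made-up values.

```python
import math

def hazard_curve(ruptures, sa_levels, t_years=50.0):
    """Poisson hazard: PoE(x) = 1 - exp(-t * sum_k rate_k * P[SA > x | rupture k])."""
    poes = []
    for x in sa_levels:
        total_rate = 0.0
        for rup in ruptures:
            # The GMPE gives a lognormal SA distribution for this rupture, so
            # P[SA > x] = 1 - Phi((ln x - ln median)/sigma) = erfc(z/sqrt(2))/2.
            z = (math.log(x) - math.log(rup["median_sa"])) / rup["sigma"]
            p_exceed = 0.5 * math.erfc(z / math.sqrt(2.0))
            total_rate += rup["annual_rate"] * p_exceed
        poes.append(1.0 - math.exp(-t_years * total_rate))
    return poes

# Two hypothetical sources: a frequent nearby moderate event and a rarer,
# stronger one (median SA in g, rates per year -- invented numbers).
rups = [
    {"annual_rate": 1e-2, "median_sa": 0.05, "sigma": 0.6},
    {"annual_rate": 2e-3, "median_sa": 0.20, "sigma": 0.6},
]
curve = hazard_curve(rups, [0.01, 0.1, 0.5])  # PoE decreases with SA level
```

The same structure underlies both the GMPE and CyberShake calculations; only the origin of the conditional exceedance probabilities differs.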

Empirical GMPEs can potentially be improved by supplementing the direct observations of ground motions with simulation data that use the physics of wave propagation to extrapolate to unobserved conditions. For example, the simulated basin responses of Day *et al.* (2008) were used by some of the GMPE modelers in the Next Generation Attenuation (NGA) project. The simulated motions provided key constraints on the functional form and period dependence of the basin response that would not have been possible using empirical observations alone.

This paper outlines progress towards a more ambitious goal: entirely replacing the attenuation relationship with simulation-based ground motion predictions. The computational platform for such a calculation must be able to efficiently simulate the ground motions at each site for an ensemble of rupture variations. The ensemble must be sufficiently large to characterize all sources in the earthquake rupture forecast. In particular, it must be large enough to properly represent the expected variability in the source parameters—e.g., hypocenter location, stress drop, and the slip heterogeneity. In the implementation described here, called the CyberShake platform, the ground motion time series at a given site are calculated using seismic reciprocity for an ensemble of “pseudo-dynamic” rupture variations that sample the Uniform California Earthquake Rupture Forecast, Version 2 (UCERF2) (Field *et al.*, 2009). Results are presented for approximately 415,000 rupture scenarios simulated at each of 250 sites in the Los Angeles region and for ground shaking periods of 2 s and longer. The entire computational process has been automated using scientific workflow tools developed within the SCEC Community Modeling Environment using TeraGrid high-performance computing facilities (Jordan and Maechling, 2003; Deelman *et al.*, 2006).

## 2 CyberShake Computational Platform

For a typical site in the Los Angeles region, the latest Earthquake Rupture Forecast available from the USGS (UCERF2.0) identifies more than 10,000 earthquake ruptures with moment magnitude (*M*_{w}) greater than 6 that might affect the site. For each rupture, we must capture the possible variability in the earthquake rupture process, so we create a variety of hypocenter and slip distributions for each rupture to produce over 415,000 rupture variations, each representing a potential earthquake. In CyberShake processing, there is a fairly technical, but important, distinction between ruptures (~7,000) and rupture variations (~415,000), which impacts our computational workflows. The ruptures as defined in UCERF2.0 can be generically thought of as faults that generate an earthquake of a certain magnitude. However, UCERF2.0 contains no information about the details of the rupture process, that is, where the rupture initiates (hypocenter) or what the slip distribution might be. In the CyberShake workflow, the strain Green tensor (SGT) calculations generate Green’s functions for all the faults of interest, and then post-processing must be done to generate the ground motions for each individual rupture variation on each fault.
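The rupture-to-variation expansion can be illustrated with a short sketch. The 20 km hypocenter spacing and the two slip distributions per hypocenter are the sampling values given later in Sect. 3; the fault length and the mid-interval placement of the first hypocenter are illustrative assumptions.

```python
def enumerate_variations(fault_length_km, hypo_spacing_km=20.0, n_slip=2):
    """List (hypocenter_km, slip_id) pairs for a single UCERF2.0 rupture.

    UCERF2.0 defines only the fault and magnitude; CyberShake adds the
    hypocenter locations and slip distributions as 'rupture variations'.
    """
    variations = []
    hypo = hypo_spacing_km / 2.0  # first hypocenter mid-interval (assumed)
    while hypo < fault_length_km:
        for slip_id in range(n_slip):
            variations.append((hypo, slip_id))
        hypo += hypo_spacing_km
    return variations

# A hypothetical 100-km-long rupture: 5 hypocenters x 2 slips = 10 variations.
vars_100km = enumerate_variations(100.0)
```

Summed over the ~7,000 ruptures affecting a site, this expansion is what produces the ~415,000 variations quoted above.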

The synthetic seismograms are then processed to obtain peak spectral acceleration values, which are combined into a hazard curve. Figure 2 contains a workflow illustrating these steps.

Data and CPU requirements for the CyberShake computational components, per site of interest

| Component | Data | CPU hours |
|---|---|---|
| Mesh generation | 15 GB | 150 |
| SGT simulation | 40 GB | 10,000 |
| SGT extraction | 680 GB | 250 |
| Seismogram synthesis | 10 GB | 6,000 |
| PSHA calculation | 90 MB | 100 |
| Total | 755 GB | 17,000 |

Once the SGTs are calculated, the ground motion waveforms for each of the approximately 415,000 rupture variations are computed. For each individual rupture variation, the SGTs corresponding to the location of the rupture (fault) are extracted from the volume and convolved with the specific rupture variation to generate synthetic seismograms, which represent the ground motions that would be produced at the site we are studying. Next, the seismograms are processed to obtain the intensity measure (IM) of interest, which, in our current study, is peak spectral acceleration at periods ranging from 3 to 10 s. Each execution of these post-processing steps takes no more than a few minutes, but SGT extraction must be performed for each rupture, and seismogram synthesis and peak spectral acceleration processing must be performed for each rupture variation. On average, 7,000 ruptures and 415,000 rupture variations must be considered for each site, which requires approximately 840,000 executions and 17,000 CPU-hours, and generates about 750 GB of data.

Considering only the computational time, performing these calculations on a single processor would take almost 2 years. In addition, the large number of independent post-processing jobs necessitates a high degree of automation to help submit jobs, manage data, and provide error recovery capabilities should jobs fail. The velocity mesh creation and SGT simulation are large MPI jobs which run on a cluster using spatial decomposition. The post-processing jobs have a very different character, as they are loosely coupled—no communication is required between jobs.
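The "almost 2 years" figure is simple arithmetic on the 17,000 CPU-hour total; the 400-core parallelism used below is an illustrative assumption, not a configuration from the paper.

```python
cpu_hours = 17_000
hours_per_year = 24 * 365  # 8,760 h in a (non-leap) year

# One processor working continuously:
years_single_cpu = cpu_hours / hours_per_year  # ~1.94 years

# The post-processing jobs are loosely coupled, so they parallelize well.
# With, say, 400 cores (a hypothetical figure), the CPU work alone fits
# inside the 24-48 h time-to-solution target discussed below:
wall_hours_400 = cpu_hours / 400  # 42.5 h
```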

These processing requirements indicate that the CyberShake computational platform requires both high-performance computing (for the SGT calculations) and high-throughput computing (for the post-processing). To make a Southern California hazard map practical, time-to-solution per site needs to be short, on the order of 24–48 h. This emphasis on reducing time-to-solution pushes the CyberShake computational platform into the high productivity computing category, which is emerging as a key capability needed by science groups. The challenge is to minimize overhead and increase throughput to reduce end-to-end wall clock time. A thorough discussion of the software tools and engineered solutions that enable CyberShake to run at scale can be found in Callaghan *et al.* (2010).

## 3 Earthquake Rupture Characterization

Uniform California Earthquake Rupture Forecast, Version 2, utilizes the rectangularized fault definitions given by the SCEC Community Fault Model version 3.0 (CFM3). Magnitudes for ruptures of these faults are estimated using magnitude-area scaling relations. Four scaling relations are currently implemented within UCERF2: Ellsworth-B (WGCEP, 2003), Hanks and Bakun (2007), Somerville (2006) and Wells and Coppersmith (1994), although currently only Ellsworth-B and Hanks-Bakun are given non-zero weights. For magnitudes larger than about 7, the Ellsworth-B and Hanks-Bakun relations predict magnitudes about 0.2 units larger for the same fault area compared to Somerville and Wells-Coppersmith. While this has little impact on calculations utilizing GMPEs, we have found that this has a significant impact on physics-based simulations. The 0.2 unit increase in magnitude corresponds to a factor of 2 increase in seismic moment. At long periods and for a fixed fault area, the ground motion amplitudes of the numerical simulations scale almost directly with increasing seismic moment (*M*_{o}). However, the GMPE has a built-in magnitude-area relation that implicitly adjusts the area for the prescribed magnitude. Thus, the ground motions for the empirical model scale more like *M*_{o}^{1/3} as the magnitude is changed.
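The factor-of-2 claim follows from the standard moment-magnitude relation log10 *M*_{o} = 1.5 *M*_{w} + 9.05 (*M*_{o} in N·m), which the following sketch makes explicit:

```python
def moment_Nm(mw):
    """Seismic moment (N*m) from moment magnitude (Hanks and Kanamori, 1979)."""
    return 10.0 ** (1.5 * mw + 9.05)

# A 0.2-unit magnitude increase at fixed fault area doubles the moment;
# long-period simulated amplitudes scale roughly linearly with moment:
moment_ratio = moment_Nm(7.2) / moment_Nm(7.0)  # 10**0.3, about 2.0

# A GMPE implicitly rescales the fault area with magnitude, so its
# amplitudes grow only about as Mo**(1/3):
gmpe_ratio = moment_ratio ** (1.0 / 3.0)  # about 1.26
```

The same arithmetic explains the Sect. 8 observation that a 0.7-unit magnitude range spans more than a factor of 10 in moment but only about a factor of two in GMPE-predicted median motion.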

For each rupture, given the fault geometry and target *M*_{w}, the rupture model is generated in the wavenumber domain by constraining the amplitude spectrum of slip to fit a *K*^{−2} falloff with random phasing (Somerville *et al.*, 1999; Mai and Beroza, 2002). The slip velocity function is constructed using two triangles as shown in Fig. 4, with the rise time scaling with increasing magnitude (Somerville *et al.*, 1999). For each fault, multiple hypocenters and slip distributions are considered. Hypocenters are placed every 20 km along strike, and two slip distributions are run for each hypocenter. The current implementation only considers median values for rise time and rupture velocity (80% of local *V*_{s}). Figure 4 displays representative rupture models for 3 different magnitudes.
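A minimal sketch of the wavenumber-domain slip construction described above: a *K*^{−2} amplitude spectrum with random phase is inverse-transformed to the space domain. The grid dimensions, corner wavenumber, and the shortcut of taking the real part (rather than enforcing Hermitian symmetry) are illustrative simplifications; the actual generator also prescribes rise times and rupture propagation.

```python
import numpy as np

def k2_slip(nx=64, nz=32, dx_km=1.0, seed=0):
    """Random slip distribution whose amplitude spectrum falls off as K^-2."""
    rng = np.random.default_rng(seed)
    kx = np.fft.fftfreq(nx, d=dx_km)   # along-strike wavenumbers (1/km)
    kz = np.fft.fftfreq(nz, d=dx_km)   # down-dip wavenumbers (1/km)
    KZ, KX = np.meshgrid(kz, kx, indexing="ij")
    k = np.hypot(KX, KZ)
    kc = 1.0 / (nx * dx_km)            # corner near the inverse fault length
    amp = 1.0 / (1.0 + (k / kc) ** 2)  # flat below kc, K^-2 falloff above
    phase = np.exp(2j * np.pi * rng.random((nz, nx)))  # random phasing
    slip = np.real(np.fft.ifft2(amp * phase))
    slip -= slip.min()                 # shift so slip is non-negative
    return slip

slip = k2_slip()  # in practice, rescaled so the mean slip matches the moment
```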

## 4 Verification

For each site of interest, we first compute a full set of SGTs for all ruptures within 200 km of the site. The SGTs are calculated via reciprocity within the prescribed 3D velocity structure using a parallelized anelastic finite-difference (FD) algorithm (Graves, 1996; Graves and Wald, 2001). We set the minimum shear wave velocity at 500 m/s and, using a grid spacing of 200 m, we obtain a maximum resolved frequency of 0.5 Hz. Two calculations are required, one for each of the two orthogonal horizontal components of motion. Each fault surface is sampled at a 1 km spacing, and the SGTs are saved for each point on each of these fault surfaces. In total, there are about 420,000 point SGT locations, and the resulting set of SGTs requires about 40 GB of storage for each site.
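The 0.5 Hz limit follows from the usual finite-difference rule of thumb of about five grid points per minimum shear wavelength; the factor of 5 is an assumption consistent with the numbers quoted above, not a value stated in the text.

```python
def fd_max_frequency(vs_min_mps, dx_m, points_per_wavelength=5):
    """Highest frequency a uniform FD grid resolves without excessive
    numerical dispersion: f_max = Vs_min / (N * dx)."""
    return vs_min_mps / (points_per_wavelength * dx_m)

# CyberShake values: 500 m/s minimum Vs on a 200 m grid.
f_max = fd_max_frequency(500.0, 200.0)  # 0.5 Hz
```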

As a check on the reciprocity-based approach, we have computed ground motions for an *M*_{w} 7.8 rupture of the San Andreas fault using both a direct forward calculation and the reciprocal calculation. The agreement between the two calculations is excellent, with the very slight differences in this case due to small differences in the spatial discretization of the forward and reciprocal models.

## 5 Hazard Curves

We compare CyberShake hazard curves with those obtained from the empirical GMPEs BA08 (Boore and Atkinson, 2008) and CB08 (Campbell and Bozorgnia, 2008). These relations use *V*_{s30} (the travel-time averaged shear wave velocity in the upper 30 m) to account for site response effects. Additionally, the Campbell-Bozorgnia relation incorporates a basin response effect based on the depth to *V*_{s} = 2.5 km/s beneath the site (referred to as Z2.5), which can have a significant impact at deep basin sites for the longer periods. Rupture directivity effects are not explicitly included in these GMPEs. However, all of these effects are naturally included in the CyberShake results through the use of the 3D velocity structure in the ground motion simulations. The hazard curves were generated using the resources and applications of OpenSHA (Field *et al.*, 2003, http://www.opensha.org) to combine the ground motion amplitudes calculated by CyberShake with the rupture probabilities specified in UCERF2.0. The PAS site can be regarded as a “rock” site with a relatively high *V*_{s30} of 748 m/s and a relatively shallow Z2.5 of 0.31 km, whereas the USC, WNGC and STNI sites are “basin” sites with relatively low *V*_{s30} of about 280 m/s and a thick accumulation of soft sediments (Z2.5 ranges from about 3 to 6 km).

At the rock site (PAS), both the CyberShake and GMPE approaches produce similar results, whereas for the basin sites, the hazard levels produced by CyberShake are at or above the GMPE results. At USC, which has a modest basin depth (Z2.5 of 3.9 km), the CyberShake curve tracks the CB08 curve quite closely, with both being at a somewhat higher level than BA08. A similar trend is seen at STNI; however, due to the much greater basin depth (Z2.5 of 6.0 km), both CyberShake and CB08 predict much higher hazard levels than BA08. At 3 s period, the basin amplification term of CB08 increases the median ground motion level by about 25% and 75% for Z2.5 of 3.9 and 6.0 km, respectively. The GMPE of BA08 implicitly incorporates basin response effects through the *V*_{s30} site response term; however, this approach cannot distinguish between sites of different Z2.5 with the same *V*_{s30}, such as the case considered here. The similarity of the hazard levels produced by CyberShake (which naturally includes basin response effects) and CB08 (which has an explicit basin amplification term) for the USC and STNI sites demonstrates the potential importance of deep basin sediments on long period hazard levels.

In addition to basin response, the CyberShake motions can be further amplified by rupture directivity effects. This is particularly evident at the WNGC site, which has a modest basin depth (Z2.5 of 2.81 km) but still exhibits relatively high ground motion hazard in the numerical simulations. Previous TeraShake (Olsen *et al.*, 2006, 2008) and ShakeOut (Graves *et al.*, 2008) ground motion modeling studies have shown that this site is susceptible to channeling and amplification of basin waves for larger ruptures on the southern San Andreas fault. This amplification effect represents a coupling of rupture directivity and basin response which cannot be accounted for using the existing GMPE parameterizations. At this point, it is not entirely clear whether this coupling of directivity and basin response might occur at other sites in and around the Los Angeles basin region. However, the development of hazard maps using the CyberShake approach will allow for the systematic analysis of this effect and the identification of additional sites that are susceptible to this type of ground motion amplification phenomenon.

## 6 Disaggregation

Disaggregation of the empirical hazard results shows contributions from both nearby moderate magnitude (*M*_{w} 6–7.5) events and more distant large magnitude events (*M*_{w} > 7.5) on the San Andreas fault. The fact that CyberShake reproduces this pattern is a key validation of the methodology. The main differences between the CyberShake and the empirical results are (1) CyberShake shows a higher contribution from the nearby sources compared to the San Andreas events, and (2) CyberShake shows significant contributions for negative epsilon values, particularly for the nearby sources.

For the STNI site, the main contributing nearby sources are ruptures of the Newport-Inglewood and Palos Verdes fault systems. Both of these fault systems lie south of the site and generally form the south-western margin of the Los Angeles basin. Because these faults are immediately adjacent to the basin, they are particularly efficient at channeling energy into the basin as the rupture propagates along the shallow portion of the fault. This coupling of fault rupture and basin response leads to larger ground motions for these ruptures, and thus increases their percent contribution to the overall hazard compared to the San Andreas ruptures. Furthermore, this type of coupling cannot be captured using the current set of empirical GMPEs.

Since the CyberShake calculations are explicitly site-based, the variability in ground motions estimated at a given site is controlled directly by variations in the source characterization model (hypocenter location and slip distribution) and deterministic wave propagation effects. This direct approach alleviates the need for the ergodic assumption that is implicit in traditional PSHA (Anderson and Brune, 1999; O’Connell *et al.*, 2007). For moderate magnitude ruptures and for long ground shaking periods (e.g., SA at 3 s and greater), the contribution of source variability to ground motion variability at a given site is relatively small. Hence the estimated distribution of ground motions for these ruptures is rather narrow or, stated another way, the ground motions have a relatively low sigma compared to that predicted by empirical GMPEs. There is mounting evidence to support the idea that ground motion variability expected at a particular site for a particular set of ruptures is significantly less than would be estimated using the full sigma of the GMPEs (e.g., Atkinson, 2006), which is consistent with the CyberShake results presented here.

## 7 CyberShake Hazard Maps

To produce a CyberShake hazard map, we first compute the residuals between the CyberShake predictions and those obtained from an empirical GMPE at each site. We then construct an interpolated version of this residual field that covers the region of interest. Finally, we construct the CyberShake map by adding the interpolated residuals to the original GMPE-based map. An example of this process is illustrated in Fig. 9 using the GMPE of Campbell and Bozorgnia (2008). For this demonstration, we use 3 s SA and an exceedance probability of 2% in 50 years. The resulting CyberShake map shows generally elevated hazard for many of the deep basin sites, and a generally reduced hazard level along the San Andreas fault.
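The map-construction steps can be sketched as follows, assuming simple inverse-distance weighting for the interpolation (the paper does not specify the interpolation scheme; the site coordinates and residual values below are made up):

```python
def idw_interpolate(sites, values, grid_points, power=2.0):
    """Inverse-distance-weighted interpolation of site residuals onto a grid."""
    out = []
    for gx, gy in grid_points:
        num = den = 0.0
        exact = None
        for (sx, sy), v in zip(sites, values):
            d2 = (gx - sx) ** 2 + (gy - sy) ** 2
            if d2 == 0.0:
                exact = v  # grid point coincides with a site
                break
            w = 1.0 / d2 ** (power / 2.0)
            num += w * v
            den += w
        out.append(exact if exact is not None else num / den)
    return out

# Hypothetical ln-residuals (CyberShake minus GMPE) at three sites:
sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
residuals = [0.3, -0.1, 0.2]
grid = [(0.0, 0.0), (0.5, 0.5)]
interp = idw_interpolate(sites, residuals, grid)
# The CyberShake map value at each grid point is then the GMPE-based map
# value plus the interpolated residual.
```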

## 8 Conclusions

The SCEC CyberShake project has developed an approach for implementing physics-based waveform simulations in seismic hazard calculations. The advantage of the physics-based approach over the GMPE approach is that deterministic earthquake rupture and wave propagation effects are explicitly included in the ground motion response. Furthermore, since CyberShake is explicitly site-based, there is no need for the ergodic assumption when computing probabilistic hazard estimates. That is, the predicted variability in ground motions at a given site directly includes all of the 3D path and rupture characteristics specific and unique to that site for the prescribed earthquake rupture forecast. Thus, there is no need to utilize ground motions recorded in one region for application in another, or to assume that the variability of ground motions observed over a spatially distributed set of sites can be used as a surrogate for the temporal variability of ground motions expected at a given site over many earthquake cycles. Application of the CyberShake approach requires significant computational resources, which have been made available through the SCEC Community Modeling Environment (http://www.scec.org/cme). Currently, the CyberShake waveform calculations are band-limited to ground shaking periods greater than 2 s (providing spectral accelerations for periods of 3 s and greater). The restriction to longer periods is primarily due to computational limitations, and is not an inherent limitation of the physics-based methodology. As the methodology is further developed, we intend to push the calculations to shorter periods. Our preliminary results demonstrate this approach is viable and practical.

Incorporation of physics-based ground motions within a probabilistic framework requires careful consideration of how the earthquake ruptures are characterized. The current UCERF2.0 characterization uses an average of two magnitude-area scaling relations (Ellsworth-B and Hanks-Bakun), which systematically underestimate fault areas compared with physics-based rupture model inversions for event magnitudes larger than about 6.7, particularly for strike-slip faults (Somerville, 2006). This has little consequence for the traditional GMPE-based hazard calculations because the GMPEs do not explicitly consider fault rupture area (or static stress drop) in determining ground motion levels. However, the physics-based approach is quite sensitive to the magnitude-area scaling, because both magnitude and rupture area are required to fully characterize the rupture. Thus, using a fault area that is too small requires a corresponding increase in slip to preserve the target magnitude (seismic moment) in the physics-based simulations, which scales almost directly into ground motion amplitude at the longer periods. To circumvent this problem, we have modified the original UCERF2.0 fault descriptions by extending their down-dip widths such that the resulting fault rupture areas correspond, on average, to the Somerville (2006) scaling relation. Validation tests indicate that the modified rupture descriptions provide a much better match to recorded ground motion levels than the original descriptions.

The range of prescribed magnitude variability for the characteristic ruptures in UCERF2.0 is about 0.7 units or larger for a given fault rupture area. Since the fault rupture area is held fixed in these characteristic ruptures, the average ground motion levels in the numerical simulations scale almost directly with seismic moment. This 0.7 unit magnitude range corresponds to a change in seismic moment of over a factor of 10. However, as described above, rupture area is not a parameter that is used directly by the GMPEs. Thus, the GMPE implicitly adjusts the area (to maintain a constant median static stress drop) when the magnitude is changed, and consequently the ground motion levels predicted from the GMPEs are much less sensitive to changes in magnitude, scaling roughly with seismic moment to the one-third power. The magnitude range of 0.7 units produces a variability in the median ground motion levels predicted by the GMPEs of only about a factor of two. This raises two important issues with respect to magnitude characterization in the earthquake rupture forecast (ERF). First, the strong sensitivity of the numerical simulation results to magnitude variability for a constant rupture area (combined with the relative lack thereof for the GMPEs) suggests that the prescribed range of magnitude variability defined by UCERF2.0 needs to be carefully examined. Second, it is possible that the large range of UCERF2.0 magnitude variability coupled with the use of GMPEs may result in a double counting of this effect due to its incorporation within the uncertainty estimates (sigma) of the GMPE.

## Acknowledgments

Funding for this work was provided by SCEC under NSF grants EAR-0623704 and OCI-0749313. Computational resources were provided by USC’s Center for High Performance Computing and Communications (http://www.usc.edu/hpcc) and through NSF’s TeraGrid Science Gateways program (http://www.teragrid.org) using facilities at the National Center for Supercomputing Applications (NCSA), the San Diego Supercomputer Center (SDSC) and the Texas Advanced Computer Center (TACC) under agreement with the SCEC CME project. This is SCEC contribution 1426.