Assessing infrequent large earthquakes using geomorphology and geodesy: the Malawi Rift

In regions with large, mature fault systems, a characteristic earthquake model may be more appropriate for modelling earthquake occurrence than extrapolating from a short history of small, instrumentally observed earthquakes using the Gutenberg–Richter scaling law. We illustrate how the geomorphology and geodesy of the Malawi Rift, a region with large seismogenic thicknesses, long fault scarps, and slow strain rates, can be used to assess hazard probability levels for large infrequent earthquakes. We estimate potential earthquake size using fault length and recurrence intervals from plate motion velocities and generate a synthetic catalogue of events. Since it is not possible to determine from the geomorphological information if a future rupture will be continuous (7.4 ≤ MW ≤ 8.3 with recurrence intervals of 1,000–4,300 years) or segmented (6.7 ≤ MW ≤ 7.7 with 300–1,900 years), we consider both alternatives separately and also produce a mixed catalogue. We carry out a probabilistic seismic hazard assessment to produce regional- and site-specific hazard estimates. At all return periods and vibration periods, inclusion of fault-derived parameters increases the predicted spectral acceleration above the level predicted from the instrumental catalogue alone, with the most significant changes being in close proximity to the fault systems and the effect being more significant at longer vibration periods. Importantly, the results indicate that standard probabilistic seismic hazard analysis (PSHA) methods using short instrumental records alone tend to underestimate the seismic hazard, especially for the most damaging, extreme magnitude events. For many developing countries in Africa and elsewhere, which are experiencing rapid economic growth and urbanisation, seismic hazard assessments incorporating characteristic earthquake models are critical.


Introduction
For characterising earthquake sources, the Gutenberg-Richter (GR) scaling law (Gutenberg and Richter 1954), which expresses a decrease in earthquake frequency with increasing magnitude, is traditionally used to extrapolate possible occurrence rates of large earthquakes from a history of small earthquakes that can be observed directly. On the other hand, in regions with large, mature fault systems, a characteristic earthquake model may be more appropriate for hazard assessment (Wesnousky 1994). As occurrence of a large earthquake involves a long strain accumulation process and our observation period of such a process is short, there is significant uncertainty regarding the estimated rate of occurrence. Moreover, high-quality earthquake data for large fault systems are not readily available for all seismic regions around the world. In this case, the size of mapped fault segments can be used to estimate the characteristic magnitude via simple scaling relationships (e.g. Wells and Coppersmith 1994;Kolyukhin and Torabi 2012;Stirling et al. 2013), and geodetic estimates of the rate of strain accumulation can be used to determine the associated recurrence interval (Reid 1911;Shimazaki and Nakata 1980). For identified faults and a given magnitude, the frequency can be inferred from estimated strain rate and thus can be incorporated into a probabilistic hazard assessment framework (Papastamatiou 1980). However, estimating characteristic magnitude and frequency of occurrence for an individual fault or fault system is a very uncertain proposition and depends strongly on assumptions (e.g. Murray and Segall 2002). One of these is particularly critical and relates to the continuity of fault segmentation (Dawers and Anders 1995;Shen et al. 2009). Testing sensitivity to different fault rupture scenarios is essential for contextualising a seismic hazard and interpreting results in the light of inherent uncertainty about physical mechanisms and our incomplete knowledge. We illustrate how the geomorphology and geodesy of the Malawi Rift, a region with large seismogenic thicknesses, long fault scarps, slow strain rates, and a short historical record, can be used to assess hazard probability levels for large infrequent earthquakes.
For East Africa, a seismic hazard study conducted by Midzi et al. (1999), as part of Global Seismic Hazard Assessment Programme (GSHAP), applied a conventional probabilistic seismic hazard analysis (PSHA) methodology by considering an available instrumental catalogue only. While the highest recorded magnitude in the Malawi region is M6.5, Midzi et al. (1999) extrapolated to M7.4 over a wider area, using a single GR relationship without uncertainties. However, geomorphological evidence in the Malawi Rift indicates strongly potential for hosting characteristic earthquakes up to M8 (Jackson and Blenkinsop 1997). Therefore, the current regional seismic hazard assessment may be significantly incomplete. To address these issues, we explore the differences between the GR and characteristic earthquake models in the light of the geomorphological evidence for large events and take account of uncertainties in fault dimensions, orientations, and segmentation to generate a synthetic source catalogue. Moreover, we investigate the impact of incorporating geomorphological information on the PSHA results for certain cities around Lake Malawi by producing regional seismic hazard contour maps, uniform hazard spectra (UHS), and seismic disaggregation plots based on both instrumental and extended earthquake catalogues.
The significance and novelty of this work are that geomorphological and geodetic information and data from the Malawi Rift fault systems are formally integrated into a synthetic earthquake catalogue that provides both a more comprehensive seismicity model for a hazard assessment in terms of magnitude coverage and a versatile resource for evaluating thoroughly the impact of catalogue properties through sensitivity analysis. The procedure demonstrated in this study is equally applicable to other seismic regions where instrumental seismicity data are insufficient to derive long-term occurrence rates of large characteristic earthquakes (e.g. Walker et al. 2008).
The East African Rift system (EARS) is situated at the plate boundary between the Somalian and Nubian Plates (Chorowicz and Na Bantu Mukonki 1980) and extends over 4,000 km from the triple junction in Afar to fault-controlled basins in Malawi (Ebinger 2005) and south through Mozambique (Fonseca et al. 2013). It is the only rift system that is currently active over a continent-wide scale (Yang and Chen 2010). The Malawi Rift lies on the southern branch of the EARS (Specht and Rosendahl 1989) and extends from Rungwe in the north to the Urema graben in the south (Ebinger et al. 1987). Figure 1a shows the seven half-graben units mapped using seismic stratigraphic analysis: Livingstone, Usisya, Mbamba, Bandawe, Metangula, Mwanjage, and Bilila-Mtakataka (Flannery and Rosendahl 1990). Relatively long fault lengths ([90 km) and wide seismogenic thickness ([30 km) in the Malawi Rift system suggest that earthquakes of M W 7.0 or greater are possible (Ebinger et al. 1987(Ebinger et al. , 1993Jackson and Blenkinsop 1997). Fault segmentation, including step offsets on the order of 1-2 km across strike, is known to halt rupture propagation and can provide a limit on earthquake size that is difficult to assess, particularly from fault maps or seismic surveys. However, studies of recent earthquake ruptures exposed on land suggest these faults rupture as single continuous units. For Fig. 1 Lake Malawi region of East Africa. a Locations of faults and major cities, and b locations of major seismic events (M W C 4) and seismic source zones example, Jackson and Blenkinsop (1997) describe a 'steep, clean fault scarp 4-15 m high that truncates spurs and is very clear on the ground, from the air and in air photographs'. This scarp can be traced, with no significant breaks, along the entire length of the fault. A single-slip event on the Bilila-Mtakataka fault (Fig. 1a), with an average displacement of 10 m over the 100-km fault and a depth of 40 km, could produce a seismic moment (M 0 ) of *10 21 Nm, resulting in a M W 8.0 earthquake (Jackson and Blenkinsop 1997).
Although no earthquakes with magnitudes greater than 7 have been instrumentally recorded in the study region, a M S 7.4 event occurred 200 km to the north-west of Lake Malawi in 1910. The Rukwa, Tanzania earthquake, is the largest historically documented earthquake in Africa and was recorded by seismographs in Europe, but a lack of azimuth coverage prevented the determination of an accurate position of the earthquake rupture (Ambraseys and Adams 1991). European settlements and missionary sites over a wide area up to 130 km away reported heavy damage and building collapses. The earthquake triggered landslides, produced seiches in Lakes Tanganyika and Malawi, and caused springs of water to dry up in surrounding areas. Subsequent aftershocks on 18 December 1910 (M S 5.7) and 3 January 1911 (M S 6.0) were also recorded. The lack of seismographs in the region still remains an issue for the EARS.
The instrumental catalogue of the Malawi Rift region records two major events with M W C 6.0: the 1989 Salima-Dedza-Mchinji earthquake and the 2009 Karonga earthquake sequence (Fig. 1b). The Salima-Dedza-Mchinji earthquake caused the deaths of nine people, 100 were injured, and 50,000 left homeless. The epicentre was close to the Malawian capital, Lilongwe, and the effects were felt in many Mozambican provinces, as well as Zambia (Gupta and Malomo 1995). The Karonga sequence occurred between 6 and 9 December 2009, when a succession of earthquakes hit the Karonga region of Malawi. Four of these earthquakes had M W [ 5.5 and five more events with a body wave magnitude (m b ) between 5.0 and 5.2 occurred (Biggs et al. 2010;Fagereng 2013). Four deaths were attributed directly to the earthquakes with 300 people injured and 1,000 houses collapsed (United Nations Office of the Resident Coordinator 2009). Although there are no examples of earthquakes [ M W 7, it is assumed that larger-magnitude earthquakes would inflict greater damage and that preparedness and knowledge of seismic acceleration of large, infrequent events is crucial in minimising risk.
Importantly, seismic vulnerability is mainly determined by two factors: (1) seismic hazard and (2) structural capacity and community resilience against earthquakes. In the case of Malawi, Mozambique, and Tanzania, large earthquakes originating from the Malawi Rift zone may induce significant ground-shaking potential, while buildings are not of a high standard in terms of seismic resistance. Low-probability and high-consequence earthquakes would affect poor and vulnerable populations in this region.

Methods
PSHA provides a convenient mathematical framework to incorporate various kinds of seismic hazard information into the assessment (Cornell 1968;McGuire 2004). It is concerned with the evaluation of likelihood of strong ground motion intensities in terms of peak ground acceleration (PGA) and spectral acceleration (SA). The outputs from PSHA are seismic hazard contour maps (i.e. plot of expected ground motion levels in space for a given probability), UHS (i.e. ordinates of expected ground motion levels for a given probability as a function of vibration period), and seismic disaggregation results (i.e. graphical representation of relative contributions due to different earthquake scenarios to overall seismic hazard). These are essential tools for developing national seismic hazard maps and modern seismic design provisions in national building codes.
One of the most important features of PSHA is a comprehensive accounting of uncertainties related to earthquake occurrence, source rupture, wave propagation, and site effects by integrating hazard contributions over all scenarios. Over the past few decades, the PSHA methodology has evolved significantly by incorporating more detailed and extended components. A modern PSHA in active seismic regions takes into account both historical seismicity based on the available regional earthquake catalogues and long-term activity rates of major faults. In regions with short historical records and little paleoseismology, the potential for large earthquakes tends to be neglected in PSHA, resulting in biased, potentially lower seismic hazard estimates.
This study focuses on extending the model for seismicity beyond historical/instrumental earthquake catalogues using information from mapped faults and geodetically determined rates of strain accumulation. We use the Malawi Rift system as an example region, and two major information sources of regional seismicity are consulted, they are as follows: (1) instrumental earthquake catalogue compiled by the International Seismological Centre (ISC), and (2) geomorphological evidence for large, infrequently rupturing fault systems (e.g. Flannery and Rosendahl 1990;Jackson and Blenkinsop 1997). We generate four separate catalogues such as: (1) an instrumental catalogue (IC); (2) an extended catalogue based on the assumption that each mapped fault ruptures in a continuous manner (CRC); (3) an extended catalogue based on the assumption that each mapped fault ruptures in segments (SRC); and (4) an extended catalogue based on a mixture of continuous and segmented rupture cases (MRC). The IC case is represented by four areal source zones ( Fig. 1b) together with the associated GR relationships. On the other hand, the CRC, SRC, and MRC cases reflect different fault rupture behaviour, in addition to the IC. The details of the earthquake catalogue generation are described in Sect. 2.1. Subsequently, other critical model components, such as selection of GMPEs and integration of hazard contributions, are described in Sect. 2.2. PSHA evaluates the expected ground-shaking intensities at different probability levels by summing contributions from all possible scenarios using the total probability theorem. This can be decomposed into three parts. The first component is regional seismicity modelling that characterises the spatial and temporal distribution of earthquakes and their magnitudes. Typically, a synthetic catalogue can be developed through Monte Carlo simulation (Musson 2000;Atkinson and Goda 2011). The second component is ground motion prediction equation (GMPE). Thirdly, summation of the hazard contributions and handling of uncertainty are achieved using a logic-tree approach. Figure 2 shows a flow chart of the seismic hazard assessment adopted in this study.

Instrumental catalogue (IC)
The ISC catalogue, containing 1,607 events (with magnitudes of 2 or above, after declustering) in the Malawi Rift region, is used to develop seismic source zones and associated GR relationships (Fig. 1b). The majority of the seismic events occurred along the fault systems identified by Flannery and Rosendahl (1990). The global CMT catalogue reports 25 focal mechanisms for the region, including nine associated with the 2009 Karonga earthquake swarm (Ekström et al. 2012). Of these, 23 are normal (rake -90°± 30°), and we fix a rake of -90°for all subsequent analyses.
Since the ISC catalogue compiles information from a variety of techniques, we sorted the sources according to priority and converted all magnitudes to M W (Scordilis 2006). Specifically, duration magnitude (M D ) values were omitted, and M W , M S , or m b were favoured, where applicable. The catalogue was declustered to remove foreshocks, aftershocks, and swarms using an approach by Gardner and Knopoff (1974). We then evaluate the magnitude completeness of the ISC catalogue (M c ) in this region and obtain a-and b-values from the GR relationship: log 10 N(Cm) = abm, where N is the annual number of earthquakes with magnitudes equal to or greater than m. The developed GR relationship is shown in Fig. 3. Inspection of the magnitude-recurrence relationship suggests the catalogue is complete since 1965 for events with M W C 4.5 for the entire region (for subareas, such as Zones 2-4, a lower magnitude of 4.0 for catalogue completeness is adopted). From an engineering viewpoint, earthquakes below M W 4.5 produce little structural damage. The same minimum magnitude was considered in GSHAP (Midzi et al. 1999). The estimated b-value of 1.02 for the whole region is relatively close to the GSHAP value of 0.929 for this source zone (Midzi et al. 1999). Due to the heterogeneous spatial pattern of earthquake occurrence, the region was divided into four source zones, to each of which is ascribed separate a-and b-values (Table 1). These values imply the expectation of an earthquake of M W 7.8 or greater every *1,250 years on average and an earthquake of M W 7.0 or greater every *200 years for the entire region (note: for Zone 1 within which the Malawi Fault systems exist, recurrence intervals for M W 7.8 and M W 7.0 earthquakes are *1,000 and *170 years, respectively). The standard deviation of the b-value is estimated using a method proposed by Tinti and Mulargia (1987) For all four zones, the minimum moment magnitude is set to 4.5 in PSHA calculations, while the maximum moment magnitude is set to 6.25, 6.5, and 6.75 with the weight of [0.25, 0.5, and 0.25] in the logic tree

Geomorphological/geological catalogue (CRC, SRC, and MRC)
We apply the approach that Jackson and Blenkinsop (1997) used to calculate the size of the potential Bilila-Mtakataka earthquake to all mapped faults of Flannery and Rosendahl (1990) (Fig. 1a). Given the fault length, L, and width, W, the scaling laws of Wells and Coppersmith (1994) can be used to estimate slip and seismic moment (e.g. Bilham et al. 1995), which is then converted to M W (Hanks and Kanamori 1979). Most normal-faulting earthquakes have a slip-to-length ratio, a, between 2 9 10 -5 and 1 9 10 -4 (Elliott 2009); we use a = 6(±4) 9 10 -5 . The fault width is controlled by the fault dip, d, and the seismogenic thickness. Considering most of the earthquakes as pure-normal, a dip of 60°is chosen (Jackson 1987). Earthquake centroid depths indicate that the seismogenic thickness around the Archean cratons of southern Africa is significantly larger than the global average, with estimates between 20 and 40 km (Craig et al. 2011). Therefore, a seismogenic thickness of 30 km was chosen. Assuming large earthquakes rupture the entire seismogenic thickness, this gives a fault width of 34.6 km. The maximum and minimum magnitudes are calculated by propagating the following errors through the calculation: 30 ± 10 km in seismogenic thickness, ±10 km in length, 30 ± 5 MPa in rigidity, and 6 ± 4 9 10 -5 in a.
The direction, /, and rate, r, of strain accumulation for each fault are calculated based on the GSRM version 1.2 model (Kreemer et al. 2003), using geodetically determined values for the Nubia and Rovuma microplates (Stamps et al. 2008). The slip vector for each fault is projected into the direction of the plate motion using the azimuth of the slip vector, h, and an occurrence interval, R, is calculated. Where two faults run parallel (e.g. Usisya and Mbamba), the rate of strain accumulation is halved such that the time-averaged rate of fault slip is equal to the plate motion  For all faults, the rupture is assumed to end at a depth of 30 km and propagate to the surface at a dip of 60°, giving a fault width of 34.6 km. C refers to an assumption that the entire fault will rupture in a single event, while S refers to shorter segments that may rupture independently. Plate motion rates, azimuths, and associated uncertainties are calculated from the microplate model of Stamps et al. (2008). a Wheeler and Karson (1989); b Contreras et al. (2000); c Flannery and Rosendahl (1990); d Jackson and Blenkinsop (1997 where F is set to 0.5 for overlapping faults and 1 otherwise. From geomorphological evidence, it is often unclear whether a fault ruptures along its entire length in a single event or in a series of smaller earthquakes. If we consider the fault to rupture completely, the magnitude of the earthquake would be greater, and the recurrence time interval longer, than for a sequence of smaller (but significant) events, each of which only ruptures a smaller fault segment. Of the seven fault systems in the Malawi Rift region, there is clear evidence that the last event on the Bilila-Mtakataka fault ruptured its entire length (Jackson and Blenkinsop 1997) and the Livingstone fault also appears to be a single, continuous structure (Wheeler and Karson 1989). However, observations suggest that the other five faults consist of segments that could rupture individually (Flannery and Rosendahl 1990;Contreas et al. 2000;Contreras and Scholz 2001). We subdivide the faults using the best available information (see Fig. 4; Table 2), and calculate the values of magnitude and recurrence interval based on two competing assumptions, continuous rupture and segmented rupture. Table 2 gives the following parameters for each of the seven continuous faults and 17 fault segments (19 including the Bilila-Mtakataka and Livingstone faults; for the base case, these two faults are treated as continuous even for the SRC case, but see Sect. 4.1 for further discussion): length, seismic moment, moment magnitude, slip, plate motion component, slip accumulation rate, and recurrence period. In order to acquire upper and lower M W values for each, uncertainties for all parameters are taken into account in the above calculations. The 'continuous rupture' catalogue (CRC) contains characteristic magnitudes in the range 7.4 B M W B 8.3 with recurrence intervals of 1,000-4,300 years, whereas the 'segmented rupture' catalogue (SRC) has values in the ranges of 6.7 B M W B 7.7 and intervals 300-1,900 years (excluding the Livingstone and Bilila-Mtakataka faults). Moreover, to account for situations that both assumptions regarding segmentation are equally possible (as a simplistic case), the two catalogues CRC and SRC are combined with an equal weight; this case is referred to as the mixed rupture catalogue (MRC).
In contrast to the GR expectation of one earthquake of M W 7.8 or greater every 1,000 years somewhere in Zone 1, our approach suggests the Livingstone fault, on its own, could host such an earthquake recurrence rate. When all the faults are considered together on the assumption of continuous rupture, an earthquake of M W 7.8 or greater would be expected every *350 years in Zone 1. For the assumption of independent segmented rupture, only the Bilila-Mtakataka and Livingstone faults would be capable of producing M W 7.8 earthquakes, but M W 7.0 or larger earthquakes would be expected to occur every 50 years with our model compared to 170 years from the GR parameters. However, it is more likely that the stress transfer from the rupture of a single segment would bring neighbouring segments closer to failure, thus initiating a sequence of events that cluster in time (Stein 1999), especially considering the 2009 Karonga sequence (Biggs et al. 2010;Fagereng 2013). A probabilistic approach, such as PSHA, gives a more appropriate estimate than a simple repeat interval calculation, but further investigations on temporal clustering of segmented faults are needed before this can be formally included in the calculation.
We convert the magnitudes of instrumentally recorded earthquakes over the last 47 years into an extension rate, using similar parameters, assuming earthquakes occurred on rift-parallel normal faults. Although the resulting rate of 0.3 mm/year is likely to be an overestimate due to geometrical complexities, it is still an order of magnitude smaller than the current plate rate. These preliminary calculations demonstrate that the GR relationship predicts a much smaller number of large earthquakes than would be expected when considering geodetic and geomorphological evidence for the Malawi Rift.

Synthetic catalogue generation
To produce a statistically meaningful seismic hazard assessment, a catalogue spanning a long period is required. We use Monte Carlo simulation (Atkinson and Goda 2011) to generate four synthetic catalogues, IC, CRC, SRC, and MRC. The simulation duration is set to 5 million years. The Monte Carlo sampling is conducted for (1) occurrence time using the GR statistics for areal sources and rupture recurrence rates for fault sources; (2) location of earthquakes by considering uniform seismicity within a zone for areal sources and on rupture segments for fault sources; (3) focal depth at discrete values, i.e. 10-30 km, with 5-km interval and with equal weight; (4) earthquake magnitude using the GR relationships for areal sources and the characteristic magnitude models where seismic moments are approximated by truncated normal distribution (truncation at mean plus/ minus one standard deviation); (5) fault plane dimensions, i.e. length, width, strike, and dip, using empirical scaling relationships. For fault sources, the variability of strike is represented as a normal variate with standard deviation of 5°, whereas that of dip is modelled by the truncated normal distribution with standard deviation of 20°and lower and upper bounds of 40°and 80°.
The CRC, SRC, and MRC contain events from the IC (8.95 million earthquakes with M W C 4.5 over the source zones are shown in Fig. 1b). Consideration of uncertainties of fault parameters in Eq. (1) leads to the evaluation of lower and upper estimates of seismic moment for a given fault system. This is used as a range of possible variation for fault rupture scenarios and is taken into account in the simulation. Specifically, seismic moment is considered as a truncated normal variate with its standard deviation equal to a half of the sum of the lower and upper seismic moments; the truncation is implemented at the lower and upper seismic moment values. The generated seismic moment value is then converted to moment magnitude. The CRC/SRC/MRC catalogues contain 14,700, 107,000, and 60,800 fault-rupture-based earthquakes, respectively. Figure 3b compares GR plots of the four synthetic catalogues for the entire region. For M W B 6, all catalogues follow the observed GR relationship (Fig. 3b) but the catalogues diverge at higher magnitudes. The IC is generated using three possible maximum magnitudes, 6.25, 6.5, and 6.75, with weights of 0.25, 0.5, and 0.25, respectively, causing a drop-off by M W 6.7. Although the SRC, CRC, and MRC all fall within the 5th and 95th percentiles of the extrapolated GR relationship, these large faults are not uniformly distributed within the entire source zone. In other words, despite the similarity between the extrapolated GR statistics based on the ISC historical catalogue and those based on the geological (synthetic) catalogues, the geological approach contains more spatial information on the distribution of potential hazard.

Ground motion models and seismic hazard calculations
The choice and weighting of GMPEs are the key components in PSHA (Atkinson and Goda 2011). In the standard PSHA, typically multiple GMPEs are used to take into account epistemic uncertainty regarding a correct median motion level for a given scenario. Each GMPE has its own parameterisation and range of applications. It is important to consider (1) regional dependence of the GMPE (Stafford et al. 2008), and (2) extrapolation of a magnitude range beyond their applicable limits (Bommer et al. 2007).
Due to the lack of strong motion instruments in eastern and southern Africa, ground motion prediction for the region is poorly constrained (Midzi et al. 1999). Jonathan (1996) and Twesigomwe (1997) attempted to develop a regional prediction model but relied on parameters that are not used in modern PSHA, such as hypocentral distance r hypo and surface magnitude M S , and underlying empirical data were not extensive. Therefore, they are excluded from our study. Alternatively, GMPEs derived for similar tectonic terrains may be used (e.g. Mavonga et al. 2010). Given the current limitation of available regional GMPEs, four well-tested GMPEs for shallow crustal earthquakes are selected: Akkar and Bommer (2010), Campbell and Bozorgnia (2008), Chiou and Youngs (2008), and Boore and Atkinson (2008). An equal weight is assigned to each equation in a logic tree. These four models take different distance measures, such as rupture distance and Joyner-Boore distance; distance measures for individual rupture scenarios are computed using detailed fault information during PSHA. The applicability to the Malawi region of GMPEs from other active areas to the Malawi region is an assumption: when strong motion data and region-specific GMPEs become available, this assumption should be critically evaluated.
Regarding local site conditions, the four models adopt the average shear wave velocity in the uppermost 30 m V S30 ; as there are no direct studies of V S30 in the Lake Malawi region, a V S30 value of 760 m/s (i.e. generic rock site) is used for this study. We compared the four GMPEs over a distance of 200 km at several magnitudes (M W 5 to M W 8) for V S30 values of 760 and 1,500 m/s and for a range of vibration periods (PGA and SAs between 0.1 and 3.0 s). The four GMPEs adopted in this study overlap for the data range having moderate earthquakes and distances between 20 and 120 km, where underlying ground motion data are relatively abundant. Typically, the data points are within ±50 % with respect to the overall average of the predicted ground motions. In the light of prediction errors associated with these GMPEs, this is a reasonable match in terms of median predictions. When the models are compared for data-scarce (large earthquake and nearsource) scenario combinations, they tend to diverge. In the data-scarce regions, differences in the median predicted ground motions can be as large as a factor of 3, which are attributed to different functional forms and parameterisations to account for physical phenomena, such as hanging-wall effects.
Finally, PSHA evaluates the expected ground-shaking intensities corresponding to different probability levels. A rational approach to achieve this is to take all possible scenarios into account and sum up the effects in the assessment by reflecting their relative contributions. Mathematically, the annual rate that a specific ground motion level is exceeded m GM (Cgm) is given by where the subscript indices i and j are for the model/assumption (1 to p) and for the source zone (1 to q), respectively; w i is the weight of a model/assumption i (note: w i sums to 1); k M,ij is the annual occurrence rate of potentially damaging earthquakes (e.g. events with magnitudes greater than 4.5); P(GM C gm|m, r) is the probability of ground motion parameter exceeding a level gm for a given scenario specified by magnitude m and distance r; f M,R,ij and X m,r,ij are the probability density function of an earthquake scenario with m and r and the scenario domain (for integration), respectively. In Eq.
(2), the first summation is related to alternative models and assumptions (e.g. different catalogues and GMPEs), while the second summation is related to multiple source zones and faults in space (such that all major earthquake sources are included in the calculation). The use of discrete models in Eq.
(2) is for convenience only (e.g. logic-tree approach); they can be replaced by the continuous probability distributions for more general cases.

Results
We compare the results for (a) different earthquake catalogues (IC, CRC, SRC, and MRC), (b) different return periods (T R = 500, 1,000, and 2,500 years), (c) different ground motion , and (d) regional scales and individual sites (major cities in the Malawi Rift region, such as Lilongwe and Mzuzu; see Fig. 1a). Large-magnitude earthquakes on the fault systems produce local, intense peaks in ground motion parameters such that the spatial resolution at which the regional PSHA is run is crucial. Thus, we discuss general patterns using the regional maps and perform a quantitative comparison between catalogues and return periods for individual sites.
3.1 Regional seismic hazard maps Figure 5 shows the regional hazard maps for the different earthquake catalogues (IC, CRC, MRC, and SRC) at different return periods (500, 1,000, and 2,500 years) for PGA. Precise values depend on the spatial resolution of the PSHA, and we use a spacing of 0.1°. The IC case for PGA and with a return period of 500 years most closely resembles the findings of Midzi et al. (1999), who used 'traditional' PSHA methodology. In general, the two sets of results agree well, but as the definitions of the source zones differ, a direct, quantitative comparison is not attainable.
Including the geomorphological evidence in the catalogue (CRC, SRC, and MRC) increases the hazard values at all three return periods (500, 1,000, and 2,500 years). The most significant changes occur in the direct vicinity of the fault systems, but with some influence out to several hundred kilometres. For example, next to the Metangula fault (13°S, 34.5°E), PGA is 0.078g for the IC and 0.259g for the SRC at the return period of 500 years, and 0.161g for the IC and 0.614g for the SRC at 2,500 years. Values for the MRC and CRC, and for intermediate return periods, lie between these values; a more detailed comparison between catalogues is given for a single site in Sect. 3.2. Results for other parameters, which are omitted for brevity, show the influence of the fault systems more clearly but the overall appearance is similar. This is because large seismic events contribute more significantly to long-period ground motions (see Fig. 6).

Site-specific seismic hazard assessment
We focus on Malawi's capital city, Lilongwe, to quantify the effects of different return periods, vibration periods, and catalogues. Figure 6a shows UHS at the return periods of 500 and 2,500 years for the four catalogue cases. At all return periods and vibration periods, including the fault-derived parameters (CRC, SRC, and MRC) increases the predicted SAs above the level predicted for the IC alone-for example, at a vibration period of 2.0 s and a return period of 2,500 years, the SA is 0.025g for the IC, but greater (0.051g) for the SRC (i.e. 100 % increase). As large-magnitude events produce seismic waves with a rich, longperiod spectral content, the effect is more significant at longer vibration periods than at shorter ones (Fig. 6a)-for example, at a vibration period of 0.1 s, the increase is only 3 % (from 0.349g for the IC to 0.358g for the SRC). The SRC achieves larger SAs than the CRC-for example, at a vibration period of 2.0 s and a return period of 2,500 years, the SRC produces the SA of 0.051g, but the CRC produces the SA of 0.043g. This is because more frequent, slightly smaller earthquakes produce higher probabilistic hazard estimates at short return periods such as 500 years than less frequent, larger events (Fig. 6a).
Next, we use seismic disaggregation to link the probabilistic estimates with specific scenarios for the MRC case. Figure 6b, c shows that, for PGA, events that exceed the threshold (i.e. hazard estimate for the selected return period) most frequently are those in the IC (dark blue), which typically have M W 4.5-6.5 and occur at distances \100 km from Lilongwe. At a vibration period of 1.0 s (Fig. 6d), the threshold is exceeded most often by segmented ruptures (SRC; yellow), which have magnitudes *M W 7.0 and occur at distances \200 km from Lilongwe. The continuous ruptures (CRC; cyan) have the greatest magnitudes (*M W 8.0) and can influence over the greatest distances (\300 km), but occur relatively rarely (1,000-5,000 years, Table 2), in comparison with the return period of interest (=500 years), such that they do not contribute significantly to the overall seismic hazard.
In detail, certain features of UHS and seismic disaggregation vary for different locations (e.g. Mzuzu and Lichinga), depending on proximity to the fault systems and on their recurrence characteristics; these effects are investigated in the following section.

Comparison of seismic hazard at major cities
It is important to assess seismic hazard for all major population centres in the region. For instance, Lilongwe is the capital of Malawi (population *780,000), but the population of Blantyre is only slightly smaller, at *730,000. Thus, in Fig. 7, we compare UHS for eight cities indicated in Fig. 1a with populations greater than 95,000. Lilongwe, Blantyre, and Mzuzu are in Malawi; Iringa, Mbeya, and Songea are in Tanzania; and Lichinga and Cuamba are in Mozambique. To illustrate the contributions of all possible scenarios and earthquake magnitudes for each city, we use the MRC, which combines the IC, SRC, and CRC into a single catalogue. The seismic hazard depends on a combination of distance Fig. 8 Relative hazard contributions from different earthquake sources for eight cities. a PGA at the 500-year return period, b PGA at the 2500-year return period, c SA at 1.0 s at the 500-year return period, and d SA at 1.0 s at the 2500-year return period from the fault systems and on location within the region of diffuse seismicity. The highest hazard estimates for all catalogues and all vibration periods are seen at the city of Mzuzu (Malawi) and at Lichinga (Mozambique), due to their positions in the middle section of Lake Malawi (Fig. 1), whereby they are close to multiple fault systems. The city of Cuamba (Mozambique) lies outside Zone 1 of the IC and thus exhibits the lowest seismic hazard, particularly at short vibration periods. The remaining cities all lie within Zone 1. Their UHS diverge at longer vibration periods as the influence of individual faults is more significant but they all are affected by the same magnitude-frequency relationship in the IC (which produces small-to-moderate events only).
We then apply seismic disaggregation to illustrate the contribution from each source zone or fault system for each city (Fig. 8). When considering PGA, the seismic hazard is dominated by small earthquakes from diffuse background seismicity, but for long-period ground motion parameters (SA at 1.0 s), the influence of the fault systems increases. Thus, use of the IC alone significantly underestimates the seismic hazard posed to urban structures (especially large, multi-storey buildings). The single largest contribution is from the Bandawe fault system, which dominates the seismic hazard at all vibration periods for the city of Mzuzu (Malawi; population *175,000) but has a limited influence on other cities. Earthquakes on the Livingstone fault dominate seismic hazard for the city of Mbeya (Tanzania; population *280,000) but their influence is limited to Mbeya and Songea alone. Earthquakes on these faults present a large threat to a single city, whereas, in contrast, an earthquake on the Metangula fault would affect almost all the cities in the Lake Malawi region. Such effects can be visualised and quantified by producing a scenario shake map. Importantly, the potential widespread disaster impacts of such an earthquake on national infrastructure and economy would greatly hamper restoration and recovery. Comparing the two largest populations centres, Lilongwe and Blantyre (both Malawi), shows that Blantyre is more susceptible to smaller earthquakes already represented in the IC, but Lilongwe faces a significant threat from large earthquakes on the Mentangula fault.

Sensitivity analysis
In regions such as sub-Saharan Africa, where the instrumental catalogue is short and detailed studies of regional tectonics, fault structures, and paleoseismology are rare, seismic hazard analysis, particularly PSHA, requires a large number of assumptions. Throughout the paper, we have included uncertainties on parameters wherever possible. Here, we test the sensitivity of our results to the basic assumptions such as: (1) the characteristic earthquake model, (2) the chosen scaling relationships, (3) fault segmentation, and (4) neglecting any aseismic contribution to plate motion.

Maximum magnitude
In this study, we have represented the major faults using a characteristic earthquake model, with associated uncertainties. However, large earthquakes, outside the historical record, could also be represented using a GR-type frequency-magnitude relationship extended to a greater maximum magnitude. To test this, we carried out additional synthetic catalogue generations and PSHA calculations by extending the M max parameter for Zone 1 (Zone 1 contains all faults considered in this study). This allows us to explore two aspects: (1) how the magnitude distribution is changed from a truncated normal distribution (i.e. characteristic model) to a truncated exponential distribution (i.e. GR relationship), and (2) how potential zones/areas for large earthquakes (M w [ 7) are extended spatially.
We consider M max values of 8.0 and 8.5 (note: the continuous rupture case can produce the maximum magnitude of about 8.2). The b-value for these two cases is the same as the default value (b = 0.97). In addition, the third case is set up by considering M max = 8.25 and a lower b-value of b = 0.91, which is still within one standard deviation of the mean. When large faults are included in the synthetic catalogue, point-source approximation is not applicable. To take into account possible finite-fault sizes and typical orientations, equations by Wells and Coppersmith (1994) are used to estimate the fault length and width, while the strike and dip are adopted from the average of the seven faults (with uncertainty).
The annual seismic moment releases over Zones 1-4 for the three cases (i.e. M max = 8.0, M max = 8.5, and M max = 8.25 with b = 0.91) are 1.29E?18, 2.34E?18, and 2.50E?18 Nm, respectively. These are similar, but slightly lower than the annual seismic  Location-specific PSHA is then carried out for four cities, Lilongwe, Mzuzu, Zomba, and Lichinga (all located relatively near to major faults), using the three IC catalogues with the extended M max values. The UHS for Lilongwe is shown in Fig. 9c, d; similar results were obtained for the three other locations. Figure 9c compares UHS at return periods of 500 and 2,500 years for different catalogue cases, and Fig. 9d shows the UHS ratios for three cases with respect to the MRC case. The results indicate that when the extended M max value (8.5) is considered, the hazard estimates are increased slightly with respect to MRC (up to about 15 %) or, for some cases, they are decreased from the hazard values based on MRC (up to about 30 %), depending on the return period level and vibration period. The increase in hazard values with respect to the MRC at shorter vibration periods is because large events are no longer constrained to defined fault zones and can occur closer to the city. At longer vibration periods, the gradual attenuation of spectral acceleration over distance means that large events at distant locations have a greater influence on hazard values, and the hazard is decreased relative to the MRC.

Scaling relationships
Throughout this paper, we have used the scaling relationships of Wells and Coppersmith (1994) assuming the earthquakes rupture the entire seismogenic zone. Alternatively, the scaling relationships of Leonard (2010) provide an empirical relationship for the width of the fault based on the fault length alone. Assuming each fault ruptures its entire length in one event, the widths lie in the range 35-52 km, which corresponds to a seismogenic thickness of 30-45 km, comparable to, but slightly higher than, our estimate of 30 ± 10 km. However, the values for the smaller, segmented rupture catalogue give widths as low as 15 km, which suggest that despite lengths in excess of 30 km, the faults do not rupture the entire seismogenic layer, or that the seismogenic layer thickness is overestimated. The corresponding moments, calculated using the scaling relationships for both width and displacement from Leonard (2010), are typically smaller than those using Wells and Coppersmith (1994) by a factor of 2-4, resulting in magnitudes smaller by 0.2-0.4 units. Propagating these differences would result in systematically lower hazards.

Fault segmentation
The processes by which earthquake ruptures stall, or propagate across multiple segments, remain poorly understood in general, and detailed information on the fault systems of Malawi is not available. Hence, we have tested the assumptions of continuous rupture, segmented rupture, and a mixed rupture case. However, we have assumed throughout that the Bilila-Mtakataka (BM) and Livingstone (LV) faults always rupture their entire length in single event. Here, we test this hypothesis by performing additional PSHA calculations with segmented fault rupture cases for the LV and BM faults. For both faults, the entire length is divided into three segments with equal lengths (see Table 2). The PSHA results for the MRC and CRC cases for Lilongwe (close to BM) and Mbeya (close to LV) are shown in Fig. 10a-d. The results indicate that the effects of segmentation for Lilongwe are minor (up to 5 % increase), whereas those for Mbeya are more significant. Thus, the impact of segmentation is complex and depends on various factors (e.g. proximity to faults, relative contributions from different faults, return period level of interest, and recurrence periods of faults).

Aseismic motion
Our calculations so far have assumed that the faults in the Malawi Rift system release all their strain as seismic energy. However, aseismic strain release is known to occur at plate boundaries including mid-ocean ridges (Solomon et al. 1988) and in subduction zones (Schwartz and Rokosky 2007). In continental rifts, aseismic deformation is mostly associated with magmatic processes, such as dyke intrusions (Wright et al. 2006;Biggs et al. 2009). Although the Rungwe volcanic province lies at the northern end of Lake Malawi, this section of the rift is considered to be immature, with extension occurring primarily by seismic slip on border faults rather than during dyke intrusion episodes (Ebinger 2005). However, if a proportion of the plate motion is taken up magmatically, the method presented here would underestimate the recurrence periods of large earthquakes and hence overestimate seismic hazard.
We carried out additional PSHA calculations for the MRC at Lilongwe, Mzuzu, Zomba, and Lichinga by considering aseismic deformation of 10 and 20 %. Specifically, recurrence rates are modified by dividing original rates by (1a) where a is the proportion of the plate motion accommodated aseismically (i.e. if we assume aseismic deformation of 10 %, then a = 0.1, and the recurrence rate of a fault segment is therefore divided by 0.9 to produce a longer recurrence period). The PSHA results for Lilongwe (UHS and UHS ratio with respect to the base case) are shown in Fig. 10e, f; similar results are obtained for the three other locations. The results indicate that including aseismic deformation decreases the hazard values slightly (up to 10 %) at longer vibration periods. With aseismic deformation included, greater impact on UHS values at longer vibration periods is anticipated as the fault sources become dominant for long-period spectral ordinates.

Influence of long-period spectral accelerations
The difference in response to earthquakes at different vibration periods is crucial in designing earthquake-resistant buildings. Even for hazard assessments based on the IC alone, this effect can be seen: for Lilongwe, PGA is more affected by smaller-magnitude, short-distance earthquakes, while for SA at 1.0 s, it is influenced by events occurring further away (Fig. 6b vs d). The inclusion of additional events based on fault mapping and plate motion data indicates that low-probability, high-consequence earthquakes do profoundly influence the seismic hazard at a site. Although it is not possible to determine whether faults rupture as short segments or along the length of entire faults, synthetic catalogues including a combination of both behaviours can be constructed and used to produce a probabilistic assessment. Current PSHA methods in the Lake Malawi region, which do not include such events, are dramatically underestimating the seismic hazard. This is highlighted for Lilongwe in the seismic hazard results for SA at 1.0 s at a return period of 2,500 years, where almost half the hazard contribution is the consequence of low-probability, large-impact events.

Wider application
This study uses the Malawi Rift to demonstrate that large infrequent earthquakes that are not represented in the short instrumental catalogue may have a significant influence on seismic hazard. However, the Malawi Rift is not the only region for which a characteristic earthquake model may be assumed. In fact, this approach is likely to be appropriate for any region with one or more of (1) a low strain rate, (2) thick seismogenic layer and/or long fault ruptures, and (3) a short instrumental record. For example, despite the almost complete absence of instrumentally recorded seismicity in the Hangay region of Mongolia, the region is cut through by numerous distributed strike-slip faults that accommodate regional left-lateral shear between Siberia and China (Walker et al. 2008). In cases such as the Altai mountains, large earthquakes are known to occur, but knowledge of strain rate is necessary to estimate recurrence intervals independent of the GR relationship (Nissen et al. 2009). The Bhuj earthquake in 2001 highlighted the high earthquake risk in the Indian region of Gujarat but focused attention away from the Himalayan arc, where there are few instrumentally recorded earthquakes; a historically recorded sequence of large earthquakes in the nineteenth century suggests that a major earthquake is overdue (Bilham et al. 2001). In each of these cases, applying a standard PSHA approach to the instrumental catalogue would lead to a significant underestimation of seismic hazard. Our approach demonstrates that additional sources of information on the properties of major faults can be incorporated in a probabilistic approach by constructing suitable synthetic earthquake catalogues that reflect geomorphological, geodetic, and other emerging forms of data.

Conclusions
Our principal conclusion is that the GR relationship, especially when based solely on a short instrumental catalogue, does not sufficiently capture seismicity potential in many tectonic settings. In areas where a characteristic earthquake model is more appropriate, geodetic and geomorphological information can be included to generate a synthetic earthquake catalogue suitable for probabilistic analysis. This approach, applied to the Malawi Rift, demonstrates that ignoring large, infrequent earthquakes tends to underestimate seismic hazard, particularly at the long vibration periods that disproportionally affect multi-storey constructions.
Although it is not possible to determine from the geomorphology alone whether a fault may rupture in a series of events on separate segments or in a single event along the whole fault, it is possible to include both possibilities within a single synthetic catalogue. We find the highest probabilistic hazard is associated with segmented ruptures, as the recurrence intervals for these events are most similar to the return period of the hazard assessment. Using seismic disaggregation to compare the seismic hazard at the two largest populations centres in the Lake Malawi region, Lilongwe, and Blantyre, we find that Blantyre is more susceptible to smaller earthquakes similar to those already recorded instrumentally, but Lilongwe faces a significant threat from a large earthquakes on the Mentangula fault.
This approach currently assumes that all plate motion is released seismically and that earthquakes occur independently, even on segmented faults. Sensitivity tests show that these assumptions affect the precise hazard values, but not that the conclusions regarding relative hazard levels are robust. Accounting fully for all these processes probabilistically will require greater understanding of stress transfer, rupture barriers, earthquake clustering, and aseismic strain release.