Analysis and characterisation of extreme wind gust hazards in New South Wales, Australia

Extreme wind gusts cause major socioeconomic damage, and the rarity and localised nature of those events make their analysis challenging by either modelling or empirical approaches. A 23-year long data record from 29 automatic weather stations located in New South Wales (eastern Australia) is used to study the distribution, frequency and average recurrence intervals (ARIs) of extreme gusts via a peaks-over-threshold approach. We distinguish between gust events generated by synoptic phenomena (e.g. cyclones and frontal systems), hereafter called “synoptic events”, and convective phenomena (i.e. thunderstorms), hereafter called “convective events”, using the wind time series. For synoptic events the frequency of gusts >25 m/s decreases systematically inland from the coast, in contrast to convective gusts which are more uniformly distributed geographically and occur more often than synoptic gusts at nearly all inland locations. At inland locations the most extreme wind gusts are likewise dominated by convective events, whereas at coastal stations both gust types have similar intensities at low ARIs but convective events again dominate at the highest ARIs. Extreme gust directions were found to be predominantly westerly at inland locations and southerly at coastal ones, with more variable direction for convective than synoptic events. This study confirms the dominant role of thunderstorms in producing the most extreme gusts in the region, and shows that wind risk varies strongly with distance from the coast.


Introduction
Wind gusts are rapid changes of wind speed or short-duration speed maxima. They are defined by the Australian Bureau of Meteorology based on the maximum 3 s average wind speed, measured 10 m above ground. Strong wind gusts can cause significant structural damage, including the destruction of infrastructure such as power transmission lines, wind and solar farms and homes (Leaning and Guha-Sapir 2013;World Meteorological Organization 2014;Loredo-Souza et al. 2019;Solari 2020;Zhou et al. 2018). For example, in October 2019 more than 25,000 homes were left without power across the city of Perth in Western Australia after a strong storm with wind gusts of more than 100 km/h knocked trees and branches onto power lines and caused damage across the city (Bureau of Meteorology 2019). Historical catastrophe data from the Insurance Council of Australia show more than 20 billion Australian dollars lost (normalised to 2017) to catastrophes related to strong storms generating extreme winds from 2016 to 2020 (Data Hub 2022). Wind gusts can also be hazardous to aircraft (Soekkha 1997) and intensify wildfires (known as bushfires in Australia) (Potter 2012).
Strong wind gusts in regions around the world can be associated with storm types such as tropical cyclones (Stern et al. 2021), extra-tropical cyclones (Paulsen and Schroeder 2005), fronts, and convective systems (i.e. thunderstorms) (Solari 2020), including combinations of these storm types for compound events (Dowdy and Catto 2017). Whilst cyclones can sustain strong winds for hours or days, convective systems appear to be the main source of extreme gusts above 25 m/s in many regions (Geerts 2001;Holmes and Moriarty 1999;Holmes 2002). Even the strongest gusts in some extra-tropical cyclones can arise from embedded mesoscale convective systems (Earl and Simmonds 2019) which combine with the synoptic flows (Holland et al. 1987;Hopkins and Holland 1997;Sanabria et al. 2011). Strong wind gusts in thunderstorms can be generated from sub-saturated downdrafts hitting the surface and then spreading out, generating a gust front (Stull 2016). The arrival of a gust front can lead to a sudden and dramatic increase in wind speed near ground level, and gust fronts can form in conjunction with synoptic ridges and fronts (Gibson 2007). Wind and gust production in severe storms involve convective and mesoscale dynamical processes that are less studied and less well understood than the larger-scale flow features such as synoptic fronts, which can be better captured by observations and resolved by modern global weather simulations.
The study of strong wind gusts and their characteristics has historically relied on observations, noting that periods of high-quality observations may be limited and weather stations can be sparsely distributed. Moreover, careful consideration of data quality issues is needed for wind gusts, including biases that can change the shape of the probability density function (pdf) of those events and affect statistical methods such as those used for long-term climatological analyses. Bias changes can be related to station relocations, instrument changes (for example, Dines anemometers were replaced by cup anemometers in the 1990s in Australia) and measurement height changes, and missing or implausible values can also affect results. Previous studies have applied duration correction factors to account for the difference in gust speeds given by different anemometer types, but few studies have demonstrated reliable techniques to identify inhomogeneities in extreme wind time series. Recently, Azorin-Molina et al. (2019) presented an approach to homogenise large daily peak wind gust time series, including anomaly correction and data infilling, and used this to study long-term trends for daily peak wind gust data in Australia (Azorin-Molina et al. 2021). The authors observed a decline in the magnitude of gusts between 1941 and 2016, but did not focus on extreme wind gusts.
Indeed, in spite of the significant hazards associated with the most extreme gusts, there are few empirical studies of them in Australia. Geerts (2001) described the regional climatology of strong wind gusts generated by thunderstorms using data from 10 stations in New South Wales (Australia), and analysed the efficiency of sounding-based indices in estimating the strength of microbursts. Holmes and Moriarty (1999) and Holmes (2002) presented an analysis of extreme winds in four major cities in Australia (Adelaide, Sydney, Melbourne, Perth) using the peaks-over-threshold approach with observations up to 1998 to support the Australia/New Zealand Standard for wind actions (Standards Australia 2002, 2011). Sanabria and Cechet (2009) used a Monte Carlo model to produce synthetic wind gust data that were analysed using the generalised Pareto distribution considering data up to 2005 from three stations in NSW. Wang et al. (2013) presented a statistical analysis (based on extreme value theory and the generalised Pareto distribution) of extreme gusts using data between 1939 and 2007 from 545 stations across Australia to predict extreme gusts under current and future climates. More recently, Brown and Dowdy (2021) investigated trends in severe convective wind (SCW) environments, corresponding to wind gusts above 20 m/s with lightning observed nearby.
The design criteria for buildings and other structures are usually stated in terms of average recurrence intervals (ARIs) (Church et al. 2006;Brabson and Palutikof 2000). For most buildings the ARI is taken to be 500 years, whilst for some major structures (e.g. school buildings) the ARI is taken to be 1000 years, and for post-disaster buildings, 2000 years (Wang et al. 2013). Prediction of such long recurrence intervals for extreme gusts requires statistical and probabilistic methods that can extrapolate beyond the available time span of the observations. Extreme wind gust occurrence frequencies are sometimes modelled by the block maxima approach, in which the time series is divided into blocks and the maximum over each block is modelled using extreme value theory (EVT) (Gumbel 1958;Palutikof et al. 1999;Coles 2001). This approach has some shortcomings related to the fact that it ignores all but one of the extreme events within a block, thus sometimes inefficiently using the available information. The Peaks-over-Threshold (POT) approach is another method where all data above a sufficiently high chosen threshold are modelled as a Generalised Pareto distribution (GPD) (Coles 2001;Pickands 1975;Holmes and Moriarty 1999). This method has the advantage of increasing the sample size and hence reducing the sampling uncertainties compared to the block maxima approach. The GPD approach is used in this study to fit the tail distribution of extreme wind gust events for 29 different Automatic Weather Stations (AWS) located across NSW, for available data up to the end of 2021, which were obtained from the Australian Bureau of Meteorology.
Since extreme gusts generated by different mechanisms can follow different probability distributions, extreme gust events are separated here into different associated storm types, as done in Holmes (2002) and Spassiani and Mason (2021) (see Sect. 2.2). Whilst the in situ data are the most accurate and reliable gust measurements, a limitation of using them is that many gusts will be missed even within areas of relatively good sampling. Our study goal is not to count every event but to determine the occurrence statistics at the locations where data are available, from which statistics at other locations could in principle be estimated via further modelling. This study aims to complement the limited available research related to extreme wind gusts, considering south-east Australia as a region of interest. Our analysis provides a further understanding of the properties of extreme wind gusts generated from synoptic phenomena and the more challenging convective events, which is important for improved guidance and planning to help minimise direct losses and for appropriate design of buildings and infrastructure. We investigate the frequency and average recurrence interval of extreme gusts, whilst also showing the variations of these characteristics between different geographic locations, including inland and coastal (<10 km from the coast) regions. We also outline and validate a carefully designed threshold selection algorithm that provides an optimum threshold for the GPD model, which may reduce the subjectivity in choosing the threshold value.
This paper is organised as follows. Section 2 provides details on the observational data, statistical approach, threshold selection algorithm, and storm classification technique. Section 3 presents results concerning ARIs, exceedances and wind directions of extreme gusts. The conclusions of the study are presented in Sect. 4.

Automatic weather station (AWS) data
This study uses daily maximum gusts. Standard 1-min AWS operational station data have been provided by the Australian Bureau of Meteorology (BoM) for 29 locations across NSW. The BoM AWS data provide reliable information on wind gust events, including the 1 min mean wind speed and direction and the strongest 3 s gust observed during the minute. The maximum of the latter over each 24 h (1440 min) period was then used for analysis. The 1 min data also include the air temperature, dew point temperature, and mean sea level pressure. However, relatively few stations have long periods of data available (e.g. relatively few AWS stations were deployed in the 1990s, with increased deployment over time). The candidate stations were chosen as the locations where good quality and relatively long data periods were available, including stations based on those used and analysed in Azorin-Molina et al. (2019). From the available stations currently operational with 1 min wind gust data, we selected for the analysis only stations having at least 13 years of data with a maximum data period of 1998-2021.
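The reduction from 1 min records to daily peak gusts described above can be sketched as follows. This is only a minimal illustration: the function name `daily_peak_gusts` and the input layout (pairs of timestamp and 3 s gust speed, with `None` for missing values) are assumptions and not the format of the BoM files.

```python
from datetime import datetime


def daily_peak_gusts(records):
    """Return {date: (peak_gust, timestamp)} from (timestamp, gust) pairs.

    `records` is a hypothetical iterable of (datetime, gust speed in m/s);
    missing gusts are given as None and are skipped.
    """
    peaks = {}
    for timestamp, gust in records:
        if gust is None:
            continue
        day = timestamp.date()
        # Keep the strongest gust seen so far for this calendar day.
        if day not in peaks or gust > peaks[day][0]:
            peaks[day] = (gust, timestamp)
    return peaks


# Hypothetical sample: three 1 min records over two days.
sample = [
    (datetime(2020, 1, 1, 10, 0), 12.3),
    (datetime(2020, 1, 1, 15, 30), 27.8),
    (datetime(2020, 1, 2, 3, 5), None),
]
peaks = daily_peak_gusts(sample)
```

Keeping the timestamp of each daily peak alongside its value makes it possible to extract the surrounding wind time series later when classifying the event.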
Data quality flags already included in the data provided from BoM were considered whilst performing quality checks, and incorrect data (flagged as wrong in the original dataset) were removed. Moreover, all wind speed time series were manually assessed and unrealistic data removed. The discarded values correspond to very high isolated peaks in the wind gust velocity time series that do not correspond to any signature (in terms of temperature, pressure and wind direction) of thunderstorm- or synoptic-system-generated gusts (as shown in Fig. 2). The BoM severe storm archive and other available online sources (e.g. the BoM website), in addition to radar data, were also cross-checked to identify spurious events. Furthermore, we checked for uncertainties related to station relocation by looking at the relocation information from the BoM database for all the AWS stations used. While some of the stations were relocated during the analysis period, we did not find any evident break points in the daily peak wind gust speeds. Corrections for terrain exposure were not applied as most of the locations considered for this study are expected to have a uniform terrain immediately surrounding the instrumentation. Furthermore, there is no appropriate roughness correction model for microbursts/macrobursts and the general wisdom is that surface roughness affects gusts generated by convective systems to a lesser extent than synoptic flows, thus applying roughness correction for convective events may not be beneficial.
To obtain a longer data record would require using data from the previously installed Dines pressure tube anemometers in addition to the data from cup anemometers used at AWS stations. This would make the quality control and homogenization of the data more challenging and would require us to perform corrections on the wind gust speed, as the cup anemometers reduce the gust speed compared to what is measured using Dines anemometers (Holmes and Ginger 2012;Azorin-Molina et al. 2019). Considering the trade-off between quality and data length we chose the good quality AWS data with reasonable data range, whilst noting that uncertainty still exists due to limited periods of high-quality AWS observations such that results should always be interpreted accordingly (Allen et al. 2011). We try to mitigate the concern of sampling uncertainty at individual stations by looking at a sufficient number of stations such that a consistent result or pattern found across many stations is unlikely to be an artifact of sampling uncertainty. Figure 1 shows the locations of the 29 AWS stations used and the reader is referred to Appendix A for more details on the selected stations.

Storm classification
Since extreme wind gusts generated by different mechanisms can follow different distributions (i.e. different physical processes can result in different statistical characteristics in data), it can be helpful to consider the causes of the gusts, such as the different storm types that can generate strong wind gusts. The process of identifying storm types and assigning individual gusts to them is challenging, and the development of more reliable methods is ongoing.
Some storm separation methods are based on the classification of events belonging to very large wind datasets, without giving detailed meteorological descriptions of the events due to the prohibitive cost of that process when using long data series. For example, Spassiani and Mason (2021) proposed self-organising maps that utilise wind gust speed, temperature and pressure data to automate the classification of convective and non-convective events. Holmes (2019) classified wind gusts as synoptic or non-synoptic based on the ratio of the wind gust speed (during the event) to the mean wind speed during the 2 h time frame either before or after the event time. This method is similar to what has been used by De Gaetano et al. (2014), who also used wind time series to classify different gust events. Other approaches focused on understanding storm types from a weather event perspective (e.g. considering specific individual case studies). Yet another method is to investigate extreme events from a weather system topology perspective, similar to the work of Catto and Dowdy (2021), who systematically linked compound hazards (extreme precipitation, winds and waves) to the associated weather system type using reanalysis and lightning observations.
Our approach is based on using large wind datasets at multiple locations to separate convective and synoptic events. This is similar to previous methods based on temporal changes in wind gust speed that have been developed to help indicate the system generating the wind gusts, with wind gusts classified here into synoptic and convective events using the wind gust ratios as in Holmes (2019) and De Gaetano et al. (2014). The temporal resolution of the AWS observations (1 min) is high enough to reliably estimate the wind gust ratios, on which the classification is defined as follows:
• If $r_1 = V_G/V_1 < 2.0$ and $r_2 = V_G/V_2 < 2.0$, then the event is classified as synoptic.
• Otherwise, the wind gust is considered as convective.
$V_G$ is the wind gust speed at the time of the event (which represents the daily peak wind gust), $V_1$ is the mean wind gust speed in the 2 h time block before the event and $V_2$ is the mean wind gust speed in the 2 h block after the occurrence of the event. Note that anywhere both types of gust event can occur there will be some events with intermediate properties, and one could consider more than two event types. However, we retain the binary classification as the data range is short and considering more than two types would result in categories with very few data and an unreliable fit.
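The gust ratio rule above can be sketched as a small function. This is an illustrative reading of the criterion, not the implementation used in the study: the name `classify_event`, the list-based input and the treatment of a zero mean speed are assumptions.

```python
from statistics import fmean


def classify_event(gusts, peak_index, window=120, ratio_limit=2.0):
    """Classify a daily peak gust as 'synoptic' or 'convective'.

    `gusts` is a hypothetical list of 1 min gust speeds (m/s) and
    `peak_index` the position of the daily peak V_G. `window` is the
    length of the 2 h blocks before and after the event, in minutes.
    """
    before = gusts[max(0, peak_index - window):peak_index]
    after = gusts[peak_index + 1:peak_index + 1 + window]
    if not before or not after:
        raise ValueError("need data on both sides of the peak")

    v_g = gusts[peak_index]
    v_1 = fmean(before)  # mean gust speed in the 2 h block before the event
    v_2 = fmean(after)   # mean gust speed in the 2 h block after the event

    # Assumption: a calm block (zero mean) is treated as an infinite ratio.
    r_1 = v_g / v_1 if v_1 > 0 else float("inf")
    r_2 = v_g / v_2 if v_2 > 0 else float("inf")

    if r_1 < ratio_limit and r_2 < ratio_limit:
        return "synoptic"
    return "convective"


# Hypothetical series: steady 15 m/s flow with a 20 m/s peak (ratios of about 1.33).
steady = [15.0] * 241
steady[120] = 20.0
label = classify_event(steady, 120)
```

With the hypothetical steady series both ratios are below 2.0, so the event is labelled synoptic; a sharp isolated peak over a calm background would instead be labelled convective.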
Weather radar imagery of some of the top gust events was also checked to aid in the validation of this method. We found that the most extreme convective gusts were associated with strong radar reflectivity in which the recorded gust event was located at the edge of a storm cell. Events characterised by a sustained wind speed, direction, temperature and pressure did not show the strong radar echoes seen for thunderstorm events. The classification criteria are illustrated in Fig. 2 considering two different events classified as synoptic vs. convective at Sydney Airport station. The convective event (Fig. 2a, c) presented a distinct increase in the wind gust speed at the time of the event and a noticeable change in wind direction and quantities like temperature and pressure. Furthermore, both wind gust ratios were above the 2.0 threshold. The synoptic event (Fig. 2b, d) showed a more uniform gust speed over the period of extraction, and no sudden change in meteorological properties at the time of the event. Moreover, the wind gust ratios had values less than 2. The wind gust ratio criterion was implemented and used to classify all the daily peak wind gust events at all the stations and over the full available time range.
The classifier used here performs very well in classifying events that are characterised by the signatures discussed above and presented in Fig. 2. However, the performance of this approach decreases when considering events that have intermediate properties, such as gusts that are characterised by a dominant peak in the gust speed but no change in other meteorological quantities, or events that have a sustained wind gust speed but large changes in other quantities during the event. The classification of such events is difficult without performing a detailed meteorological description of each event separately, which becomes impractical when using long time series and many stations. The number of daily peak wind gust events of each type at each station for all the duration of the data period is given in 4.

Generalised Pareto distribution and average recurrence interval
The GPD approach uses all the values in a dataset that are higher than a carefully chosen threshold $u_0$. The cumulative distribution function of the exceedances is given by:
$$F(u) = 1 - \left[1 + \frac{\xi\,(u - u_0)}{\tilde{\sigma}}\right]^{-1/\xi}$$
where $\xi$ and $\sigma$ represent the shape and scale parameters, respectively, and $u$ is taken as the gust speed. This expression is defined on $\{u - u_0 > 0$ and $(1 + \xi(u - u_0)/\tilde{\sigma}) > 0\}$ where $\tilde{\sigma} = \sigma + \xi(u - u_0)$. A positive shape parameter corresponds to an ordinary (unbounded) Pareto distribution, $\xi = 0$ corresponds to an exponential distribution and $\xi < 0$ is a Pareto type II distribution which implies a bounded tail. The latter is appropriate for modelling wind speeds which are naturally bounded phenomena. The shape and scale parameters are obtained here by fitting to the available data using the maximum likelihood (ML) approach (Davison and Smith 1990). The reader is referred to the book by Coles (2001) for more details on the GPD theory.
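As a rough illustration of the three regimes of the shape parameter, the distribution function can be evaluated directly. The sketch below assumes a single scale value passed as `sigma` in the denominator of the expression; the function name `gpd_cdf` and the numerical values are hypothetical.

```python
import math


def gpd_cdf(u, u0, xi, sigma):
    """GPD distribution function of gust speed u above threshold u0.

    xi is the shape and sigma the scale (assumed constant here).
    """
    y = u - u0
    if y <= 0.0:
        return 0.0
    if abs(xi) < 1e-12:
        # Limit xi -> 0: exponential distribution of the excesses.
        return 1.0 - math.exp(-y / sigma)
    arg = 1.0 + xi * y / sigma
    if arg <= 0.0:
        # Only possible for xi < 0: u lies beyond the upper bound u0 - sigma/xi.
        return 1.0
    return 1.0 - arg ** (-1.0 / xi)


# Hypothetical parameters: threshold 20 m/s, shape -0.1, scale 3 m/s.
# The bounded tail then ends at 20 - 3 / (-0.1) = 50 m/s.
p_30 = gpd_cdf(30.0, 20.0, -0.1, 3.0)
p_60 = gpd_cdf(60.0, 20.0, -0.1, 3.0)
```

For a negative shape the function reaches 1 at a finite gust speed, which is the bounded-tail behaviour noted above.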
An important property of the GPD is that for $\xi < 1$, the conditional mean exceedance (CME) is linear and defined as:
$$\mathrm{CME}(u) = \frac{\sigma + \xi\,(u - u_0)}{1 - \xi}$$
The linearity property can help in indicating the appropriateness of the GPD model and can guide the choice of the threshold value $u_0$, although this choice can still be subjective and requires using further supportive criteria. More details on the threshold selection algorithm are given in the next section.
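The linearity check can be explored empirically by computing the mean excess over a range of trial thresholds and inspecting whether it varies roughly linearly. The following is a minimal sketch; the function names and the sample values are hypothetical.

```python
def mean_excess(gusts, threshold):
    """Empirical conditional mean exceedance above `threshold` (None if no exceedance)."""
    excesses = [g - threshold for g in gusts if g > threshold]
    if not excesses:
        return None
    return sum(excesses) / len(excesses)


def mean_excess_curve(gusts, thresholds):
    """Return (threshold, mean excess, number of exceedances) for each trial threshold."""
    curve = []
    for u in thresholds:
        count = sum(1 for g in gusts if g > u)
        curve.append((u, mean_excess(gusts, u), count))
    return curve


# Hypothetical daily peak gusts (m/s) and trial thresholds.
sample_gusts = [14.2, 18.9, 21.5, 23.0, 26.4, 31.7, 19.3, 22.8]
curve = mean_excess_curve(sample_gusts, [15.0, 20.0, 25.0])
```

Returning the number of exceedances with each point is a design choice that helps flag the thresholds where the estimate rests on very few events.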
The gust speed corresponding via the above relations to a specified ARI $R$ is given as:
$$U_R = u_0 + \frac{\tilde{\sigma}}{\xi}\left[(\lambda R)^{\xi} - 1\right]$$
where $\lambda$ is the crossing rate (i.e. the number of events exceeding the threshold per year).
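A short numerical sketch of the return level relation is given below. It assumes a single fitted scale value passed as `sigma`; the function name `return_level` and the parameter values are hypothetical and are not taken from Table 1.

```python
import math


def return_level(ari_years, u0, xi, sigma, rate):
    """Gust speed with average recurrence interval `ari_years`.

    `rate` is the crossing rate (exceedances of u0 per year).
    """
    n = rate * ari_years  # expected number of exceedances within the ARI
    if n <= 0.0:
        raise ValueError("rate * ari_years must be positive")
    if abs(xi) < 1e-12:
        # Limit xi -> 0: exponential tail.
        return u0 + sigma * math.log(n)
    return u0 + (sigma / xi) * (n ** xi - 1.0)


# Hypothetical parameters: u0 = 20 m/s, xi = -0.1, sigma = 3 m/s, 2 events per year.
levels = {r: return_level(r, 20.0, -0.1, 3.0, 2.0) for r in (500, 1000, 2000)}
upper_bound = 20.0 - 3.0 / (-0.1)  # 50 m/s for these values
```

For a negative shape the return levels increase with the ARI but remain below the upper bound, so the curve flattens at long recurrence intervals.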

Threshold selection
The selection of an appropriate threshold u 0 is a critical step in fitting the data using a GPD distribution. Choosing a higher threshold increases the sampling error as fewer data points are selected. Low threshold values on the other hand result in the distribution not following the GPD form as many samples could be dependent, which violates one of the basic assumptions of the GPD distribution that assumes the exceedances of the extremes to be independent (Coles 2001). Moreover, Sanabria and Cechet (2009) performed a sensitivity analysis of the ARIs to the choice of the threshold value, considering the daily peak wind gust speeds for the study, and showed that low thresholds result in low and flat return level curves. This means that the wind gust intensity may be underestimated at long recurrence intervals using low threshold values. Furthermore, one of the issues with picking too low a threshold is that the true distribution will not exactly obey the simple GPD form and the distribution will be different than fitting only the extreme tail (even with an infinitely large data sample of independent events). As the extreme tail is the most important part for hazard-related purposes, that is where the fit is most desired to be accurate, not in the bulk of the distribution. Furthermore, given that the data range is not the same between all stations (having stations with shorter time periods), pre-fixing the same threshold value for all the considered locations to apply the GPD analysis may induce a higher sampling uncertainty at stations with short time range since fewer data points would exceed the pre-fixed threshold compared to longer data ranges at other locations. Based on that we developed an algorithm to iteratively select an optimal u 0 at each station individually. Our use of one maximum gust event per day (Sect. 2) ensures that all considered events are independent (and especially avoids sampling the same wind gust at two different times). 
The algorithm first uses the classification approach described in Sect. 2.2 to separate the storms into different types and then the thresholds are selected separately for synoptic and convective events. The shape parameter is calculated iteratively for a range of threshold values and only thresholds that give at least 20 exceedances are tested. As wind speed is a bounded phenomenon and return speeds are therefore expected to converge to an upper limit (see Coles 2001), the difference between return speeds at high average recurrence intervals should be small. For example, if the relative difference between gust speeds at R = 1000 and R = 10000 is greater than 10%, the threshold is not considered as a good candidate for the fit. Furthermore, if the generalised Pareto distribution is a good model for the excesses over a certain threshold $u_0$, the estimated shape parameter should remain roughly invariant over all thresholds higher than the one chosen (within sampling uncertainty). Based on that, the algorithm selects the GPD curves that are close to the mean stable shape factor. It should be noted here that we checked for trends in the data to consider the implications of the stationarity assumptions of the GPD; however, no statistically significant trends were found and hence the data were treated as stationary.
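The selection criteria listed above can be sketched for a single event type as follows. This is a simplified illustration and not the algorithm used in the study: the GPD is fitted here by the method of moments instead of maximum likelihood to keep the example dependency-free, the final choice is reduced to the single candidate whose shape is closest to the mean shape of the accepted fits, and all function names are hypothetical.

```python
from statistics import fmean, variance


def fit_gpd_moments(excesses):
    """Method-of-moments GPD fit (stand-in for ML); returns (xi, sigma) or None."""
    m = fmean(excesses)
    v = variance(excesses)
    if v <= 0.0:
        return None
    ratio = m * m / v
    return 0.5 * (1.0 - ratio), 0.5 * m * (ratio + 1.0)


def return_level(ari_years, u0, xi, sigma, rate):
    """Return level for a non-zero shape parameter xi."""
    return u0 + (sigma / xi) * ((rate * ari_years) ** xi - 1.0)


def select_threshold(gusts, years, candidates, min_exceed=20, max_rel_diff=0.10):
    """Return (u0, xi, sigma, rate) for the retained threshold, or None."""
    accepted = []
    for u0 in candidates:
        excesses = [g - u0 for g in gusts if g > u0]
        if len(excesses) < min_exceed:
            continue  # too few exceedances for a reliable fit
        fit = fit_gpd_moments(excesses)
        if fit is None or abs(fit[0]) < 1e-12:
            continue
        xi, sigma = fit
        rate = len(excesses) / years
        u_1000 = return_level(1000, u0, xi, sigma, rate)
        u_10000 = return_level(10000, u0, xi, sigma, rate)
        # Convergence criterion: return speeds at long ARIs should be close.
        if abs(u_10000 - u_1000) / u_1000 > max_rel_diff:
            continue
        accepted.append((u0, xi, sigma, rate))
    if not accepted:
        return None
    # Stability criterion (simplified): keep the fit closest to the mean shape.
    mean_xi = fmean(fit[1] for fit in accepted)
    return min(accepted, key=lambda fit: abs(fit[1] - mean_xi))
```

A typical call would pass the daily peak gusts of one event type at one station, the record length in years and a list of trial thresholds, for example `select_threshold(gusts, 23.0, [18.0, 19.0, 20.0, 21.0])`, where the numbers are placeholders.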
The threshold selection model is then validated using quantile-quantile comparisons and ARI plots for each station. Figure 3 represents a validation example showing data from Sydney Airport station. The threshold selected for this station is ∼20 m/s based on the criteria mentioned above. It should be noted that the threshold for synoptic and convective events may not necessarily be the same, given their different causal mechanisms. Figure 3a and b show that the fit is a very good match to the empirical data, indicating that the threshold selection and the corresponding GPD distribution is a good model of the data. The return level plots (Fig. 3c and d) also show a very good match with empirical return levels for both synoptic and convective events, which gives an extra element of confidence to the results. Table 1 shows the thresholds selected at each station along with the shape and scale factors calculated for the distribution. The absolute value of the shape factor has an overall increase from ∼0.1 to ∼0.3 for synoptic events and from ∼0.1 to ∼0.2 for convective gusts when going from coastal to inland locations. This means that the probability of occurrence of extremes is the highest at coastal locations, since a smaller (in absolute value) shape factor induces a less concave return level curve and hence a higher probability of occurrence. This increase is more clearly observed for synoptic events compared to convective gusts that are relatively more uniformly distributed across the state. This observation will be confirmed in the analysis of the frequency of extremes presented later in the paper.
Fig. 3 Quantile-quantile plots for a synoptic and b convective events considering data from Sydney Airport station. c and d are plots of return periods for both wind gust types. The dashed lines in (c) and (d) represent the 95% confidence limits
For design purposes, a shape factor of −0.1 is adopted in the Australian standards for structural design (Standards Australia 2021) for locations in Region A (covering areas not affected by tropical cyclones) and based on the current analysis, this value is good enough to predict suitable design wind speeds for most of the locations adopted in this study. Yet some locations have 500-year gust speeds higher than 40 m/s (e.g. 43 m/s for convective events at Newcastle and 43.7 m/s for convective events at Bellambi), whilst noting the range of uncertainty (95% confidence limits) given in Table 2. If corrections are applied to compare with the 0.2 s moving average results of the standards, there is a possibility that some of these locations are at more risk of destructive winds. An important step to support the results of the statistical approach presented in this section is the quantification of the uncertainty, as long-term predictions are calculated based on a short time range. For that purpose, the 95% confidence limits were calculated for all stations using bootstrapping techniques similar to what has been done by Holmes (2019).

The confidence limits are represented by the dashed lines in Fig. 3c and d. The uncertainty analysis shows that the predicted gust speed at high average recurrence intervals (e.g. R = 2000 ) can vary up to 2.9 m/s due to sampling uncertainty for convective events and up to ∼ 4.5 m/s for synoptic events. A quantification of the uncertainty levels for all the stations adopted for this study for 500-year ARI is shown in Table 2. This analysis shows that the 95% confidence limits can depart by up to ∼ 8.8 m/s from the return level given by the fit of convective events and ∼ 4.7 m/s from the fit of the synoptic events.
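A percentile bootstrap of the return level can be sketched as follows. This is only an illustration of the general technique under stated assumptions: the GPD is refitted by the method of moments instead of maximum likelihood, the crossing rate is held fixed across resamples, and the function names are hypothetical. It does not reproduce the specific procedure of Holmes (2019).

```python
import random
from statistics import fmean, variance


def fit_gpd_moments(excesses):
    """Method-of-moments GPD fit (stand-in for ML); returns (xi, sigma) or None."""
    m = fmean(excesses)
    v = variance(excesses)
    if v <= 0.0:
        return None
    ratio = m * m / v
    return 0.5 * (1.0 - ratio), 0.5 * m * (ratio + 1.0)


def return_level(ari_years, u0, xi, sigma, rate):
    """Return level for a non-zero shape parameter xi."""
    return u0 + (sigma / xi) * ((rate * ari_years) ** xi - 1.0)


def bootstrap_limits(excesses, u0, years, ari_years=500, n_boot=1000,
                     level=0.95, seed=0):
    """Percentile bootstrap confidence limits of the return level."""
    if len(excesses) < 2:
        raise ValueError("need at least two exceedances")
    rng = random.Random(seed)
    rate = len(excesses) / years  # assumption: rate kept fixed in every resample
    estimates = []
    for _ in range(n_boot):
        # Resample the exceedances with replacement and refit.
        sample = rng.choices(excesses, k=len(excesses))
        fit = fit_gpd_moments(sample)
        if fit is None or abs(fit[0]) < 1e-12:
            continue
        estimates.append(return_level(ari_years, u0, fit[0], fit[1], rate))
    if not estimates:
        raise ValueError("no valid bootstrap fits")
    estimates.sort()
    low = estimates[int((1.0 - level) / 2.0 * (len(estimates) - 1))]
    high = estimates[int((1.0 + level) / 2.0 * (len(estimates) - 1))]
    return low, high
```

The width of the interval returned by such a procedure grows as the number of exceedances shrinks, which is why stations with shorter records carry wider confidence limits.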

Exceedance frequencies
Understanding the characteristics of damaging wind gusts, i.e. gusts with speeds higher than 90 km/h (Smith et al. 2012;Brown and Dowdy 2019), is important in the risk assessment of those events and will be the focus of this section. Mapping out the exceedances geographically (Fig. 4) shows a pattern in the distribution of extreme events across NSW where the frequency of synoptic damaging wind gusts decreases from an average of ∼2.25 gusts per year to ∼0.36 events per year when going inland from the coast, whilst noting one outlier near Canberra where synoptic events are more likely to exceed the 25 m/s threshold than at other inland stations. This outlier corresponds to Mount Ginini station (ID: 070349), which is located near a peak where high winds are expected due to local terrain influences. This topographic effect may also have some influence on convective winds, noting that Mount Ginini shows relatively higher localised levels of convective gust intensities compared to other stations near the Australian Capital Territory (Tuggeranong and Cooma Airport). The effect of height is accounted for by a height multiplier in the design standards (Standards Australia 2021). The crossing rate of convective events is more uniformly distributed across the state and shows relatively high crossing rates at inland stations. Synoptic events are more frequent near the coast; however, a higher risk of destructive convective extreme gusts is observed over a wider geographical area across the state.
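The exceedance frequency discussed above amounts to counting daily peaks above 25 m/s (90 km/h) and dividing by the record length in years. A minimal sketch is given below; the function name `crossing_rate`, the `None` convention for missing days and the choice to count only days with valid data towards the record length are assumptions.

```python
def crossing_rate(daily_peaks, threshold=25.0, days_per_year=365.25):
    """Number of daily peak gusts above `threshold` (m/s) per year of valid data."""
    valid = [g for g in daily_peaks if g is not None]
    if not valid:
        raise ValueError("no valid daily peaks")
    years = len(valid) / days_per_year
    exceedances = sum(1 for g in valid if g > threshold)
    return exceedances / years


# Hypothetical record: four years of daily peaks with eight damaging gusts.
record = [12.0] * 1453 + [27.5] * 8
rate = crossing_rate(record)
```

Applying the function separately to the synoptic and convective subsets of a station gives the two rates that are mapped and compared between coastal and inland locations.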

Gust strengths for several recurrence intervals
As mentioned in Sect. 1, the design gust speeds of normal, major and post-disaster structures are specified in terms of ARIs of 500, 1000 and 2000 years, respectively. This section focuses on understanding the gust strength for high ARIs such as those that are important for design purposes. Maps for synoptic and convective extreme wind hazards, corresponding to an ARI of 500 years, are presented in Fig. 5. Maps for the 1000 and 2000-yr ARIs are not shown here as they show similar patterns to those for the 500-yr ARI. In addition, full ARI data (up to 10,000-yr ARI) for all the individual stations considered in this study are provided in the supplementary information section online.
Synoptic wind gusts show a similar pattern to the crossing rate distribution presented in Sect. 3.1 where the most extreme speeds ( > 30 m/s) are located at coastal locations. Less extreme, yet still destructive, gusts are predicted at inland locations with speed levels between 20 and 30 m/s. Convective wind gust speeds are, however, more uniformly distributed between inland and coastal stations and extreme gusts higher than 30 m/s can occur at most station locations shown here. For both synoptic and convective types, some locations have return speeds higher than 40 m/s at 500-yr ARI. Those locations could be at risk of high damage if structures are not designed to support such gust speeds. It is also apparent here that the most extreme gusts are dominantly convective events at high recurrence intervals. This is also seen in Fig. 6 that shows a comparison of ARIs between synoptic and convective events using data from six different stations (three located at coastal locations and three other inland stations) that are representative of a common behaviour between different stations. Convective events are dominant at all locations for ARIs above 100 years, although coastal locations (Fig. 6a, b and c) show close contributions from both synoptic and convective wind gusts.

Wind direction
The differences found in frequency and severity of gusts with geographical location (coastal vs inland locations) and the gust-producing mechanism (synoptic vs convective) encourage the extension of this analysis to investigate possible patterns in the direction from which the extreme gusts are blowing. The aim is to examine if there are differences in wind direction between inland and coastal locations and if there is a prevailing direction from which the extreme gusts are coming. Figure 7 shows the direction of daily peak gusts above 25 m/s over all the years of the data period available from each AWS dataset. Wind direction data from stations that are close in location are represented in one wind rose to avoid presenting a large number of wind rose plots in one figure, which can become uninterpretable. Based on that, the following five groups of stations were considered: all coastal locations, all the inland locations near the Sydney area, the inland locations in the north east of the state, the stations near ACT and the south of NSW, and finally the stations in the inner west. This is depicted in Fig. 7a and b, where each wind rose represents the wind direction data (given in terms of the eight cardinal directions) from multiple stations that are grouped in a certain geographical location.
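The grouping of gust directions into eight sectors for a wind rose can be sketched as below. The sector boundaries (45° bins centred on each direction), the function names and the sample values are assumptions made for illustration; directions are taken in degrees, measured clockwise from north, as the direction the wind blows from.

```python
from collections import Counter

SECTORS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def direction_sector(degrees):
    """Map a wind direction in degrees to one of eight 45-degree sectors."""
    index = int(((degrees % 360.0) + 22.5) // 45.0) % 8
    return SECTORS[index]


def wind_rose_counts(events, threshold=25.0):
    """Count extreme gusts per sector from (gust speed in m/s, direction in degrees) pairs."""
    counts = Counter({sector: 0 for sector in SECTORS})
    for gust, degrees in events:
        if gust > threshold:
            counts[direction_sector(degrees)] += 1
    return counts


# Hypothetical pooled events from a group of neighbouring stations.
events = [(27.0, 270.0), (31.2, 255.0), (26.1, 180.0), (19.0, 90.0), (28.4, 350.0)]
counts = wind_rose_counts(events)
```

Pooling the events of several nearby stations before counting corresponds to drawing one wind rose per station group.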
Wind direction observations show a difference between inland and coastal stations: extreme gusts are predominantly westerlies at inland locations, whereas at coastal stations they blow from the west and the south, but predominantly from the south. Those dominant directions are in line with the largest-magnitude direction multipliers (for regions A0, A2 and A3) in the Australian standards (Standards Australia 2021). Synoptic events show a much narrower distribution of direction than convective events, which means that synoptic extreme gusts blow from a more clearly defined dominant direction (i.e. north west) than convective extreme gusts. This is also related to the fact that convective events are associated with a sudden change in wind direction at the time of the event, whereas the wind direction is more sustained for synoptic events (see Fig. 2). This higher variability in the direction of convective gusts contributes to the wider distribution shown in Fig. 7b.
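The grouping of gust directions into eight sectors, as used for the wind roses, can be sketched as follows. This is an illustrative example only: the function names, the 45-degree sector boundaries centred on each compass point and the sample records are assumptions, and directions are taken to be in degrees clockwise from north, indicating where the wind blows from.

```python
SECTORS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def to_sector(direction_deg):
    """Map a direction in degrees to one of eight 45-degree compass sectors."""
    # Shift by half a sector so that, e.g., 337.5-22.5 degrees maps to "N".
    index = int(((direction_deg % 360.0) + 22.5) // 45.0) % 8
    return SECTORS[index]


def wind_rose_counts(records, speed_threshold=25.0):
    """Count peak gusts above the threshold in each sector.

    `records` is an iterable of (gust_speed_m_s, direction_deg) pairs,
    pooled over all stations in one geographical group.
    """
    counts = {sector: 0 for sector in SECTORS}
    for speed, direction in records:
        if speed > speed_threshold:
            counts[to_sector(direction)] += 1
    return counts


# Hypothetical daily peak gusts for one inland group (illustration only).
sample = [(27.1, 268.0), (31.4, 281.0), (22.0, 190.0), (26.3, 305.0), (29.8, 255.0)]
print(wind_rose_counts(sample))
```

Pooling the records of neighbouring stations before counting corresponds to drawing one wind rose per group rather than one per station.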

Conclusion
Extreme wind gust hazards have been examined using anemometric data from the last 24 years from 29 AWS stations located in New South Wales, Australia. The aim is to complement the understanding of the properties of damaging gusts in terms of their statistical distribution, frequency and average recurrence interval, using the peaks-over-threshold approach based on the generalised Pareto distribution (GPD). The threshold selection algorithm was outlined, and the model was validated against empirical data, with sampling uncertainties and confidence limits discussed. Extreme gusts were separated into storm types (synoptic and convective events), and a GPD was calculated for each type of event using the peaks-over-threshold approach. Spatial variations in extreme wind gust speed, occurrence frequency and direction were examined over NSW. This analysis can also offer insight into long-term climate changes and risks, given that different physical processes may change differently from each other in the future (e.g. small-scale convective systems may change in different ways to synoptic cyclones and frontal systems).
The study of average recurrence intervals shows that wind gust speeds higher than 40 m/s are plausible at multiple coastal and inland locations at the upper range of ARIs considered in this study, and buildings located in those areas could be at risk of severe damage if they are not designed to withstand the high wind loads of such rare events. Furthermore, synoptic extreme gusts (with return speeds >30 m/s) are clustered near the coast and their intensity decreases inland, similar to the behaviour of the synoptic gust frequency. Buildings and infrastructure are thus exposed to stronger synoptic winds at coastal locations than inland. At coastal locations, synoptic and convective winds have similar intensities at low ARIs, whereas convective winds dominate for ARIs higher than 100 years. Inland locations, on the other hand, are strongly dominated by convective events. Although designing structures to withstand high-ARI events can be costly, other results presented here may help with this. For example, the direction of extreme gusts was investigated, and maps of wind direction showed a clear difference between inland and coastal stations; wind direction is one of the key factors determining the load on a structure. Extreme gusts (both synoptic and convective types) are found to be predominantly westerlies at inland locations and predominantly southerlies at coastal stations.
Exceedance frequencies of gusts higher than 25 m/s (severe gusts) vary by location, from less than one to several severe gusts per year. The frequency of severe synoptic gusts shows a clear gradient, decreasing inland from the coast; no such gradient is seen for convective gusts. Synoptic gusts were more frequent than convective gusts near the coast, but typically less frequent inland. Additional studies are warranted to see whether this pattern persists in other parts of Australia or elsewhere in the world.
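The exceedance frequency referred to here is simply the number of severe gusts divided by the length of the record. A minimal sketch, assuming a list of daily peak gusts tagged by storm type (the function name, record layout and sample values are hypothetical), is:

```python
def severe_gust_rates(records, record_years, threshold=25.0):
    """Mean number of gusts above `threshold` per year, split by storm type.

    `records` is an iterable of (gust_speed_m_s, storm_type) pairs, where
    storm_type is a label such as "synoptic" or "convective".
    """
    if record_years <= 0:
        raise ValueError("record_years must be positive")
    counts = {}
    for speed, storm_type in records:
        if speed > threshold:
            counts[storm_type] = counts.get(storm_type, 0) + 1
    # Convert exceedance counts to an average rate per year of record.
    return {storm_type: n / record_years for storm_type, n in counts.items()}


# Hypothetical four-year record at one station (illustration only).
sample = [
    (26.0, "synoptic"), (28.5, "synoptic"), (24.0, "synoptic"),
    (31.0, "convective"), (27.2, "synoptic"), (25.5, "synoptic"),
]
print(severe_gust_rates(sample, record_years=4.0))
```

Applying the same calculation station by station, and comparing the synoptic and convective rates against distance from the coast, would expose the kind of gradient described above.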
Our analysis was intended to provide a further understanding of the geographical variations of extreme gust characteristics in a coastal midlatitude region. The results contribute to the available range of guidance relevant to planning and design applications for extreme wind hazards in this region and may be relevant more broadly. We report here a systematic trend of gust statistics with location. Such trends could be used to model, and hence estimate, gust recurrence intervals at other locations in the region. Further work should be done to explore the accuracy of this approach and the generality of these trends in other regions. We propose that similar analyses be undertaken for other regions of Australia and the rest of the world, with potential benefits from an enhanced understanding of the causes and characteristics of extreme wind gusts. This includes cases where significant variations in these characteristics are found between nearby regions, as was demonstrated here for this region of eastern Australia.