1 Introduction

Heavy rainfall is a significant threat to life and property in many parts of the world, especially when it contributes to flash floods (Johnson et al. 2016; Allen and Allen 2016). Many studies have shown the potential for climate change to impact rainfall intensity, but how it will affect other storm characteristics (size, translation speed, orientation, etc.) remains largely unexplored. The first step towards exploring the potential future changes is to establish an observed baseline for various storm characteristics.

Many climatological studies have used coarse-resolution gridded datasets (e.g., global climate models, reanalysis data) globally and over specific regions. Although the coarse resolution of these datasets cannot properly capture small-scale processes that can contribute significantly to water availability and natural disasters like flash floods (Dowdy 2020), previous studies have employed these products to establish a thunderstorm climatology based on the concept of favourable conditions for thunderstorms. These conditions usually include a combination of convective available potential energy (CAPE) and vertical wind shear in a region (Ahmed et al. 2019; Brooks et al. 2003; Taszarek et al. 2020; Allen et al. 2011; Groenemeijer et al. 2017). By employing this environmental approach, Allen and Karoly (2014) presented a severe thunderstorm climatology over Australia during 2003–2010, using the ECMWF Interim Re-Analysis (ERA-Interim; Dee et al. 2011) and reported observation data. They showed that these storms are more frequent in December during the afternoon, consistent with the diurnal cycle of surface temperature and the maximum availability of heating. A limitation of this approach is that environmentally favourable conditions do not necessarily lead to a thunderstorm, so thunderstorm frequency is overestimated. In addition, most of the studies focused only on thunderstorm frequency. Although information on other storm characteristics, including severe winds, rainfall, hail and storm intensity, can be derived from environmental information (Dowdy 2020; Brown and Dowdy 2021), the environmental approach is unable to provide more detailed information on other quantitative characteristics such as storm size, translation speed, direction, etc. (Allen and Karoly 2014).

Using coarse-resolution datasets, previous authors have investigated the effect of natural climate variability (e.g., the El Niño–Southern Oscillation (ENSO), the Indian Ocean Dipole (IOD) and the Southern Annular Mode (SAM)) on rainfall over Australia (Ashok et al. 2003; Hauser et al. 2020). Dowdy et al. (2020) employed ERA-Interim data along with lightning and rainfall observations to examine the influence of large-scale drivers such as ENSO, the IOD and the SAM on thunderstorm activity over Australia, finding no strong relationships, consistent with the findings of Allen and Karoly (2014). Since precipitation in some regions can be correlated with more than one large-scale driver, and indices are often correlated with each other, some studies such as Maher and Sherwood (2014) used a multivariate rather than bivariate approach to this problem. Note that all of these studies investigated the effects of natural climate variability on rainfall over large-scale regions, and uncertainties remain in local areas such as Sydney.

Another approach to studying thunderstorms is measuring the occurrence of lightning. Using satellite and ground-based lightning datasets, Dowdy and Kuleshov (2014) found that during the cooler months, a maximum in lightning activity occurs over the ocean to the east of the continent. Earlier similar studies by Kuleshov et al. (2001, 2012) had found that there is a high thunderstorm frequency over the southeast part of Australia along the coastline, most often from spring to early autumn. Similar studies have been conducted globally for regions such as the United States and Europe (Taszarek et al. 2020), Brazil (Pinto et al. 2013) and the Middle East (Shwehdi 2005). Although this approach provides valuable information regarding thunderstorms, it only reveals precipitation systems that produce lightning. Additionally, it provides limited information regarding storm frequency and intensity (Dowdy and Mills 2012; Dowdy 2020), and regarding other characteristics like storm size and shape, which are poorly related or unrelated to lightning flash rates (Walsh et al. 2016).

Point-based datasets like rain gauges and reports of hail, tornadoes and wind gusts are another means of studying precipitation systems and have been used in many studies (Bhardwaj and Singh 2018; Enno et al. 2013; Saha and Quadir 2016; Pinto et al. 2013; Groenemeijer et al. 2017; Kelly et al. 1985; Doswell et al. 2005). Using reported datasets over Australia, some researchers have found that severe thunderstorms are most prevalent between October and April (Niall and Walsh 2005; Schuster et al. 2005; Davis and Walsh 2008) with a peak between 3 and 7 pm (Griffiths et al. 1993; Schuster et al. 2005), consistent with the findings of Allen and Karoly (2014) using the environmental approach and ERA-Interim reanalysis data. A higher frequency of heavy rain events over Australia in summer was also reported by Dare and Davidson (2015), using daily rain gauge data. Some efforts have also been made to employ point-based datasets to investigate relationships between ENSO and storm events, globally (Cook and Schaefer 2008; Cook et al. 2017; Lee et al. 2016, 2013; Lepore et al. 2017) and over Australia (Chung and Power 2017; Risbey et al. 2009a, b; Murphy and Timbal 2008; Nicholls et al. 1996; McBride and Nicholls 1983; Allan et al. 1996; Schepen et al. 2012; Min et al. 2013; King et al. 2014; Ashcroft et al. 2019). Based on these studies, winter-spring rainfall is usually reduced during El Niño and enhanced during La Niña over the eastern and southeastern parts of Australia. However, the relationships are stronger inland and weak or absent in coastal regions. Although point-based data have been widely used in climatological studies, their availability is highly influenced by the local population density (Allen 2018). In addition, climatological studies using reported data are mainly focused on severe storms rather than non-severe ones. Although gauge networks have the potential to provide information regarding both severe and non-severe precipitation systems, sparse networks often fail to observe the maximum intensity (Ayat et al. 2018) or even miss precipitation systems altogether (Lengfeld et al. 2020).

The limitations of these approaches have led to the application of remotely-sensed datasets, which can observe the precipitation systems with high resolution in space and time. Satellite precipitation products have been widely used in rainfall climatological studies across the world (Kidd 2001; Atiah et al. 2020; Chua et al. 2020; Morin et al. 2020). However, images from their microwave sensors are available only for a few overpasses per day (Ayat et al. 2021a, b), which may result in biasing the results depending on the diurnal cycle of convection in the respective regions (Punge and Kunz 2016).

High spatio-temporal resolution radar rainfall estimates offer great potential to study a wide range of precipitation systems that appear in their coverage areas, particularly over regions in which gauge observations are usually sparse or unevenly distributed (Ghaemi et al. 2017; Ayat et al. 2018; Moazami et al. 2014). Radar records in many parts of the globe are temporally limited to just over a decade (Allen 2018). However, this time frame is long enough to conduct climatological studies. Previous researchers have employed these datasets to study the climatology of precipitation systems over specific regions such as the United States (Ghebreyesus and Sharif 2020; Kuo and Orville 1973; Croft and Shulman 1989; Falconer 1984; Cintineo et al. 2012; Murillo et al. 2021), Europe (Kaltenboeck and Steinheimer 2015; Kreklow et al. 2020; Weckwerth et al. 2011; Overeem et al. 2009; Bližňák et al. 2018; Fairman Jr et al. 2015; Junghänel et al. 2016; Nisi et al. 2016; Fluck et al. 2021; Lukach et al. 2017; Saltikoff et al. 2010), and Asia and Australia (Chen et al. 2012; Warren et al. 2020; Dowdy et al. 2020). A limitation of the previous radar-based studies is that they often employed a pixel-based statistical approach, which limits the storm properties to rainfall/hail frequency and intensity as a function of position.

Some studies have employed an object-based approach to overcome this limitation, where discrete storm objects are identified and characterized. For instance, Haberlie and Ashley (2019), Poujol et al. (2020) and Prein et al. (2017) applied object-based techniques to radar products to study the climatology of convective storms in the United States. In Australia, Soderholm et al. (2017) applied a cell-tracking algorithm to radar data to study hail climatology in southeast Queensland. Hitchcock et al. (2021) also studied the characteristics of linear precipitation systems and their contributions to heavy rainfall over Melbourne by applying an object-based approach to data from a weather radar near Melbourne. Similar object-based radar studies have been conducted in Sydney (Potts et al. 2000), southeast Queensland (Peter et al. 2015) and other parts of the world such as Germany (Thomassen et al. 2020), Italy (Sangiorgio and Barindelli 2020) and Spain (Rigo et al. 2010). This technique has also been applied to satellite datasets to study the global climatology of different storm types (Nesbitt et al. 2000; Jiang et al. 2011; Wall et al. 2013).

Most of the previous object-based studies were limited to storm number, area, and rainfall intensity, whereas other storm characteristics like translation speed, shape and aspect ratio, orientation, direction and volume could also be of interest. In addition, the object-based techniques employed in these studies are limited by the object split/merge issue, which is a common problem in object-tracking methods and can lead to misleading storm properties (Muñoz et al. 2018). In many tracking algorithms, when a system merges with (or splits from) another system, it is treated as a separate system with its own properties. Handling split/merge events instead groups the small-scale objects that are connected via splits and merges into one large object. This gives a better estimate of system frequency and area at each snapshot, and therefore of area-dependent properties such as the mean system direction and velocity, which are derived by area-weighted averaging. In addition, treating multiple connected objects as a united object at each snapshot gives a more realistic estimate of the object centroid, and hence of the system track, track length, direction and lifetime.

In this research, we employ the Method of Object-based Diagnostic Evaluation (MODE; Davis et al. 2006) Time Domain (MTD; Clark et al. 2014), which is modified by the authors so as to handle splitting and merging of objects. We apply this technique to the Wollongong radar, located near Sydney, Australia, which has around 20 years of records, to establish an object-based climatology of precipitation in different seasons over the radar footprint areas (i.e., Greater Sydney, Illawarra and other land/ocean regions within 150 km of the radar). We group the main contributing storms with similar object-based characteristics using clustering algorithms, then investigate their relationships with different climate modes.

This study is presented in seven sections. Section 2 describes the Wollongong radar data and its characteristics. Section introduces the object-based and clustering methods along with the statistics employed in this study. Section 3 describes the study area and Sect. 4 presents the results of the object-based climatology over the study area. Section 5 discusses the results shown in the previous section, and finally, the summary of findings is presented in Sect. 6.

2 Dataset

2.1 Radar data

This study uses data from a Bureau of Meteorology operational S-band weather radar located near Wollongong, NSW (34.26° S, 150.88° E, 471 m altitude; Soderholm et al. 2020). The site experiences partial blocking of up to 3 dB in the lowest scan (0.5° elevation) from the northwest to the southeast due to the significant terrain associated with the Great Dividing Range. The archive for this radar started in November 1996 and the site continues to operate as of 2021. However, the study period is limited to June 2018. Several hardware and configuration changes have taken place during the lifetime of the radar. Initially, the radar operated on a 10-min volume cycle with 16-level reflectivity data (which is sampled at 16 discrete levels: 12.0, 23.0, 28.0, 31.0, 34.0, 37.0, 40.0, 43.0, 46.0, 49.0, 52.0, 55.0, 58.0, 61.0, and 64.0 dBZ). In December 1999 the number of reflectivity levels was increased to 64. Between October 2010 and January 2021, a major hardware upgrade delivered 160-level reflectivity data and a 6-min volume cycle. One significant gap is present in the archive from 1/1/1998 to 15/12/1998.

To ensure the accuracy of reflectivity values across the entire dataset, an absolute calibration technique is applied using precipitation radar measurements from the Tropical Rainfall Measuring Mission (TRMM) and the Global Precipitation Measurement mission (GPM). Satellite overpasses with precipitation are compared with ground radar measurements using the volume matching technique described by Louf et al. (2019), providing a mean calibration value for every pass. Periods of stable calibration are identified (through manual inspection of the stable clutter time series and knowledge of radar inspection dates), and the mean absolute calibration value for these periods is applied as an offset to the ground radar data. Removal of non-meteorological echoes from reflectivity datasets is challenging. In addition to the ground clutter filtering performed by the signal processor, the technique described by Gabella and Notarpietro (2002) is applied using filters for echo continuity and minimum echo area. Unfortunately, this technique is not suitable for removing anomalous propagation, which is commonly observed over the Tasman Sea. However, we have conducted additional analyses to address this issue (described in Sect. 2.2).

Reflectivity data are transformed to rain rates using a fitted Z–R relationship (Z = A × R^B) derived from 9 years of hourly rain gauge data using the Camden Airport AWS (35.04° S, 150.69° E). Note that in the Z–R relationship, Z is the reflectivity (mm^6 m^−3), R is precipitation intensity (mm/h) and A and B are constant coefficients that depend on the region and precipitation type. The A and B coefficients for this radar are 81 and 1.8, respectively. The maximum rain rate is limited to 100 mm/h to limit contamination from hail. Volumetric rain rates are transformed into a Cartesian grid at a 0.5 km altitude using the Barnes weighting function and a 2.5 km constant radius of influence. The final grid has a horizontal resolution of 1 km and a domain size of 300 km by 300 km. However, observations are limited to within 150 km of the radar.
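The rain-rate conversion described above can be illustrated with a minimal sketch. It assumes the input reflectivity is expressed in dBZ and uses the standard definition dBZ = 10 log10(Z) to recover linear Z; the function name `dbz_to_rain_rate` is hypothetical and this is not the authors' processing code.

```python
def dbz_to_rain_rate(dbz, a=81.0, b=1.8, max_rate=100.0):
    """Convert reflectivity (dBZ) to rain rate (mm/h) by inverting Z = A * R**B.

    a, b     : Z-R coefficients quoted in the text for this radar.
    max_rate : cap quoted in the text to limit hail contamination.
    """
    z_linear = 10.0 ** (dbz / 10.0)      # dBZ -> linear reflectivity (mm^6 m^-3)
    rate = (z_linear / a) ** (1.0 / b)   # R = (Z / A)**(1 / B)
    return min(rate, max_rate)


if __name__ == "__main__":
    for dbz in (20.0, 30.0, 40.0, 70.0):
        print(dbz, "dBZ ->", round(dbz_to_rain_rate(dbz), 2), "mm/h")
```

With these coefficients, very high reflectivities (for example 70 dBZ) are clipped to the 100 mm/h cap.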

2.2 Challenges with anomalous propagation

Despite all the efforts made in removing non-meteorological echoes, the Wollongong radar site experiences significant anomalous propagation over the ocean within the eastern portion of its coverage. We have conducted further quality control to reduce the effect of this clutter on dataset quality. For further information please refer to Sect. S.4 in online resources 1.

3 Study area

The study area in this research is the radar coverage region within 150 km of the Wollongong radar, which includes the coastal regions of Greater Sydney and Illawarra in New South Wales. The climate of these regions is categorized as humid subtropical, based on the Köppen–Geiger classification (Kottek et al. 2006), and is significantly affected by the coastal position, with small interseasonal variations ranging from cool winters to warm and hot summers (Dare and Davidson 2015). The mean annual precipitation totals recorded at Observatory Hill (33.86° S, 151.20° E; in Greater Sydney) and Wollongong University (34.40° S, 150.88° E; in Illawarra) are 1213.4 mm and 1348.6 mm, respectively (Bureau of Meteorology 2013).

Several factors may impact the precipitation over these regions and its seasonality. Generally, precipitation peaks in the first half of the year and decreases in the second half (Dowdy et al. 2015). In the summer, the easterly (or inland) trough is a major contributor to rainfall over these regions with a peak in the evening. Its impact can be enhanced by interacting with any upper-level troughs or cold fronts crossing over these regions. Frontal systems also bring rainfall to these regions throughout the year, but mostly in winter when the subtropical ridge moves northward over inland Australia (Bureau of Meteorology 2010). Another source of precipitation over these regions is cut-off lows, which can occur at any time of year but are most common during autumn and winter. These can be intense and last up to a week when formed as part of a blocking pair or east coast lows (ECL), in which case they are accompanied by long-lasting heavy rainfall and gusty winds (Bureau of Meteorology 2007). ECLs are among the most dangerous weather systems affecting the eastern coast, and are the reason for many major floods in this region (Callaghan and Power 2014; Dowdy et al. 2019), especially when they are concurrent with other phenomena such as fronts and thunderstorms (Dowdy and Catto 2017). Northwest cloud bands (stretching from northwest to southeast Australia) can also bring precipitation over these regions. They may interact with cold fronts and cut-off lows over southeastern Australia to produce very heavy rainfall over these regions (Reid et al. 2019). Several modes of variability are known to affect precipitation in Australia (Min et al. 2013; King et al. 2014). However, precipitation over the coastal zone, which includes our study area, has shown little relationship with these modes (Timbal and Hendon 2011; Pepler et al. 2014; Fita et al. 2017).

3.1 Method of object-based diagnostic evaluation (MODE) time domain (MTD)

Here we employed a modified version of MTD, proposed by Ayat et al. (2021b), that considers split/merge events during the lifetime of a precipitation system. In this method, the “objects” at each time step are the connected pixels higher than a specified threshold in the convolved precipitation map (smoothed by an n × n-pixel moving window across the map). Every object (each blob in Fig. 1c) at each time step has a unique label number unless it overlaps with an object at the previous time step, in which case it takes the label of that object. If an object overlaps with two (or more) objects at the previous time step (merging), or two (or more) objects overlap with one object at the previous time step (splitting), the object(s) at the current time step take a new label. Based on these definitions, a “sequence of objects” is a set of objects connected in time that share a unique label number and have no split/merge events during their lifetime (connected blobs with the same colours and numbers in Fig. 1c). Finally, a “system” is a group of sequences that are connected via split/merge events (the whole diagram in Fig. 1c). For more details, please refer to Sect. S.1 in online resources 1 (supplementary materials).
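The labelling rules above can be sketched as follows. This is an illustrative reading of the rules in this section, not the authors' modified MTD code: the function `propagate_labels`, the object identifiers and the list of overlapping pairs are all hypothetical, and the overlap detection itself (thresholding and connected-component analysis on the smoothed map) is assumed to have been done beforehand.

```python
from collections import defaultdict
from itertools import count


def propagate_labels(current_ids, previous_labels, overlaps, new_labels):
    """Assign sequence labels to the objects of the current time step.

    current_ids     : identifiers of the objects at the current time step
    previous_labels : dict mapping previous-step object id -> sequence label
    overlaps        : iterable of (current_id, previous_id) overlapping pairs
    new_labels      : iterator yielding unused labels
    Returns (labels, links); links are (parent_label, child_label) pairs that
    connect sequences of the same system through split/merge events.
    """
    parents = defaultdict(set)    # current object -> overlapping previous objects
    children = defaultdict(set)   # previous object -> overlapping current objects
    for cur, prev in overlaps:
        parents[cur].add(prev)
        children[prev].add(cur)

    labels, links = {}, []
    for cur in current_ids:
        prevs = sorted(parents.get(cur, ()))
        # A plain continuation: exactly one parent, which has exactly one child.
        continues = len(prevs) == 1 and len(children[prevs[0]]) == 1
        if continues:
            labels[cur] = previous_labels[prevs[0]]
        else:
            # New object, merge (several parents) or split (parent with several children).
            labels[cur] = next(new_labels)
            links.extend((previous_labels[p], labels[cur]) for p in prevs)
    return labels, links


if __name__ == "__main__":
    previous = {"a": 1, "b": 2, "c": 3}
    overlaps = [("p", "a"), ("q", "b"), ("q", "c")]   # q merges b and c
    labels, links = propagate_labels(["p", "q", "r"], previous, overlaps, count(4))
    print(labels)  # {'p': 1, 'q': 4, 'r': 5}
    print(links)   # [(2, 4), (3, 4)]
```

In this sketch the `links` list is what ties separate sequences into one system: following the links between labels recovers the kind of diagram shown in Fig. 1c.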

Fig. 1 Panel a shows the location of the Wollongong radar, Camden Airport, Observatory Hill and Wollongong University stations. Panel b shows the sequence tracks for systems that occurred on 2018-10-04 (UTC). Panel c illustrates the diagram of a tracked system with split/merge events. Note that each blob represents an object at each system snapshot (time step) and each colour shows a sequence

Figure 1b shows an example of running the modified MTD on Wollongong radar data during an event that occurred on 2018-10-04. The lines with different colours show the sequence tracks. In this event, land-originating systems moved southeastward and later merged with systems that had formed over the ocean. The systems originating over ocean and land have different characteristics before merging and should be studied separately. However, the original MTD treats them as a single system because it does not consider split/merge events in its tracking algorithm. In our approach, the information on a system and on its member objects is available at the same time, so that at each snapshot the properties of all the objects in a system are accessible both as a whole (by averaging) and separately.

The selected threshold to filter the objects is 3 mm/h in the convolved data smoothed by a 3 × 3-pixel moving window across the map. The object characteristics of interest in this study include: (1) area: the number of pixels in the object multiplied by the pixel area (here 1 km²); (2) translation speed: the ratio of the distance between the volumetric centroids (average of the x and y coordinates of all pixels in the object weighted by the pixel value (rain rate)) of two objects connected in time to the temporal resolution of the dataset; (3) maximum intensity: the maximum precipitation rate within an object; (4) average intensity: the average precipitation rate of all cells within an object; (5) volume discharge: the integration of rain rate within an object; (6) aspect ratio: the ratio of the minor to the major axis of the ellipse fitted to an object; (7) direction: the deviation angle from north of the line connecting the centroids of two consecutive objects in a sequence, indicating the direction in which the object is travelling; and finally, (8) orientation: the deviation angle from north of the major axis of the fitted ellipse.
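The centroid-based properties (2) and (7) can be illustrated with a short sketch. It assumes a Cartesian grid with x increasing eastward and y increasing northward (in km), and represents an object as a list of hypothetical `(x, y, rain_rate)` tuples; the function names are illustrative and are not taken from the MTD code.

```python
import math


def rain_weighted_centroid(pixels):
    """Centroid of an object, weighting each pixel by its rain rate.

    pixels: list of (x_km, y_km, rain_rate_mm_per_h) tuples.
    """
    total = sum(rate for _, _, rate in pixels)
    cx = sum(x * rate for x, _, rate in pixels) / total
    cy = sum(y * rate for _, y, rate in pixels) / total
    return cx, cy


def translation_speed(c_prev, c_next, dt_hours):
    """Distance between two consecutive centroids divided by the time step (km/h)."""
    return math.hypot(c_next[0] - c_prev[0], c_next[1] - c_prev[1]) / dt_hours


def direction_from_north(c_prev, c_next):
    """Travel direction in degrees clockwise from north (0 = north, 90 = east)."""
    dx = c_next[0] - c_prev[0]   # eastward displacement
    dy = c_next[1] - c_prev[1]   # northward displacement
    return math.degrees(math.atan2(dx, dy)) % 360.0


if __name__ == "__main__":
    before = rain_weighted_centroid([(0, 0, 5.0), (1, 0, 5.0)])
    after = rain_weighted_centroid([(3, -2, 4.0), (4, -2, 12.0)])
    print(before, after)
    print(translation_speed(before, after, dt_hours=0.1))  # 6-min volume cycle
    print(direction_from_north(before, after))             # southeastward quadrant
```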

The system characteristics studied in this research include: (1) mean area: the average of the system snapshot areas over the system lifetime; (2) mean volume discharge: the average of the system snapshot volume discharge over the system lifetime; (3) mean average intensity: the average precipitation rate over the system lifetime; (4) max average intensity: the maximum of the system-averaged intensity (calculated at each snapshot) over the system lifetime; (5) mean translation speed: the area-weighted average translation speed of the system snapshots over the system lifetime; (6) mean translation direction: the area-weighted average direction of the system snapshots over the system lifetime; (7) root object count: the number of root objects in the system graph diagram (e.g., in Fig. 1, objects 1, 2 and 3 are the root objects in the system diagram); and (8) split/merge count: the number of split/merge events in the system lifetime.
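As an illustration of the area-weighted averages in items (5) and (6), the sketch below computes a weighted mean speed and a weighted mean direction. The text only states that the averages are area-weighted; averaging the directions as unit vectors (a circular mean) is an assumption made here so that angles on either side of north do not cancel misleadingly, and both function names are hypothetical.

```python
import math


def area_weighted_mean(values, areas):
    """Area-weighted average of a scalar property (e.g. translation speed)."""
    return sum(v * a for v, a in zip(values, areas)) / sum(areas)


def area_weighted_mean_direction(directions_deg, areas):
    """Area-weighted circular mean of directions given in degrees from north.

    Assumption: directions are averaged as area-weighted unit vectors.
    """
    east = sum(a * math.sin(math.radians(d)) for d, a in zip(directions_deg, areas))
    north = sum(a * math.cos(math.radians(d)) for d, a in zip(directions_deg, areas))
    return math.degrees(math.atan2(east, north)) % 360.0


if __name__ == "__main__":
    speeds = [10.0, 30.0]   # km/h for two snapshots
    areas = [1.0, 3.0]      # km^2 for the same snapshots
    print(area_weighted_mean(speeds, areas))  # 25.0
    print(area_weighted_mean_direction([100.0, 140.0], [1.0, 1.0]))
```

A plain arithmetic mean of 350° and 10° would give 180°, whereas the vector form gives a direction close to north, which is why the circular form is used in this sketch.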

3.2 Clustering analysis

The aim of this section is to find system clusters (types) with similar quantitative characteristics over the study area using both the agglomerative clustering algorithm and the t-SNE technique. For more information about the t-SNE and agglomerative techniques please refer to Sects. S.2 and S.3, respectively, in the online resources (supplementary materials).

3.3 Statistics

Here we employed the non-parametric Kendall’s Tau rank correlation coefficient (τ) to investigate the strength of relationships between variables. This value is derived from Eq. 1. In this equation, N_C is the number of matched (concordant) pairs, N_D is the number of mismatched (discordant) pairs of the two variables and n is the sample size (Wilks 2011).

$$ \tau = \frac{{N_{C} - N_{D} }}{{n\left( {n - 1} \right)/2}} $$
(1)
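Equation 1 can be evaluated directly by counting pairs, as in the minimal sketch below. The function name `kendall_tau` is illustrative; tied pairs are counted as neither matched nor mismatched, which follows Eq. 1 as written and is not a tie-corrected variant.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau as in Eq. 1: (N_C - N_D) / (n (n - 1) / 2)."""
    n = len(x)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1      # both variables rank the pair the same way
        elif sign < 0:
            discordant += 1      # the variables rank the pair in opposite ways
    return (concordant - discordant) / (n * (n - 1) / 2)


if __name__ == "__main__":
    print(kendall_tau([1, 2, 3, 4], [1, 2, 3, 4]))   # 1.0
    print(kendall_tau([1, 2, 3, 4], [4, 3, 2, 1]))   # -1.0
```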

To determine the significance of the difference between two data distributions, the Kolmogorov–Smirnov test (KS-test) has been employed. This is a non-parametric test, which does not require the input data to be normally distributed. The null hypothesis in this test is that both samples come from a population with the same distribution. This null hypothesis is rejected at the level α if:

$$D_{S} > C\left(\alpha\right)\sqrt{\frac{n+m}{n\times m}}$$
(2)
$$D_{S} = \sup_{x}\left|F_{n}(x)-F_{m}(x)\right|$$
(3)
$$C\left(\alpha\right) = \sqrt{-\ln\left(\frac{\alpha}{2}\right)\times \frac{1}{2}}$$
(4)

where D_S denotes the KS statistic and n and m are the sizes of the two datasets. F_n and F_m are the empirical distribution functions of the two samples, and sup is the supremum function. Note that the supremum of a subset S of a set K is the least element in K that is greater than or equal to all elements of S (Wilks 2011).
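The test defined by Eqs. 2 to 4 can be sketched with the standard library alone. The function names are illustrative; the supremum is evaluated at the pooled sample values, which is sufficient because both empirical distribution functions are step functions that only change at those values.

```python
import math
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic D_S (Eq. 3)."""
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    d = 0.0
    for x in a + b:
        f_n = bisect_right(a, x) / n   # empirical CDF of the first sample at x
        f_m = bisect_right(b, x) / m   # empirical CDF of the second sample at x
        d = max(d, abs(f_n - f_m))
    return d


def ks_critical_value(n, m, alpha=0.05):
    """Right-hand side of Eq. 2, with C(alpha) taken from Eq. 4."""
    c_alpha = math.sqrt(-math.log(alpha / 2.0) * 0.5)
    return c_alpha * math.sqrt((n + m) / (n * m))


def distributions_differ(sample_a, sample_b, alpha=0.05):
    """True if the null hypothesis of a common distribution is rejected at level alpha."""
    d = ks_statistic(sample_a, sample_b)
    return d > ks_critical_value(len(sample_a), len(sample_b), alpha)


if __name__ == "__main__":
    land = [float(v) for v in range(0, 50)]
    ocean = [float(v) for v in range(30, 80)]
    print(ks_statistic(land, ocean), ks_critical_value(50, 50))
    print(distributions_differ(land, ocean))
```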

In order to model the relationship between two or more variables, we have employed the ordinary least squares (OLS) method, which is defined by Eqs. 5–7. In these equations, n is the number of observations, p is the number of independent variables, y is the n × 1 matrix of the dependent parameter and X is the n × p matrix of independent variables. Suppose b is a "candidate" value for the parameter β, which is the p × 1 matrix of coefficients for the independent variables.

$$y = X\beta$$
(5)
$$\beta = \mathop{\mathrm{argmin}}\limits_{b \in \mathbb{R}^{p}} S\left(b\right)$$
(6)
$$ S\left( b \right) = \left( {y - Xb} \right)^{T} \left( {y - Xb} \right) $$
(7)
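For reference, the minimizer in Eq. 6 has a standard closed form. Setting the gradient of S(b) in Eq. 7 to zero gives the normal equations, and, assuming that the matrix X^T X is invertible (i.e., the independent variables are not perfectly collinear), the solution is unique:

$$ \frac{\partial S}{\partial b} = -2X^{T}\left( {y - Xb} \right) = 0 \;\Rightarrow\; X^{T}Xb = X^{T}y \;\Rightarrow\; \beta = \left( {X^{T}X} \right)^{-1}X^{T}y $$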

An effort has been made in this study to investigate the relationships between climate modes (i.e., ENSO, IOD and SAM) and system properties in each cluster (derived from the previous section). Since precipitation can be correlated with more than one index, and the indices are often correlated with each other (Maher and Sherwood 2014; Taschetto et al. 2011; Mekanik et al. 2013; Mekanik and Imteaz 2012), we have utilized a multivariate rather than a bivariate approach to account for the dependency of the climate modes on each other. The indices selected to quantify these climate modes are Niño 3.4, the dipole mode index (DMI) and the Antarctic Oscillation (AAO) index, respectively. Niño 3.4 is the average sea surface temperature (SST) anomaly over the Pacific Ocean in the region bounded by 5°N to 5°S, from 170°W to 120°W (Rayner et al. 2003). DMI describes the difference in SST anomalies between the tropical western Indian Ocean (10°S–10°N, 50°E–70°E) and the tropical southeastern Indian Ocean (10°S–0°, 90°E–110°E) (Saji et al. 1999). AAO is defined as the leading principal component (PC) of 850 hPa geopotential height anomalies south of 20°S (Thompson and Wallace 2000). Equation 8 is the model representing the relationship between the three selected indices and each object-based system property (in each cluster) for every month, using multiple linear regression to account for the dependency between the selected indices and to include the annual variations. In this equation, C0 is the month constant and C1, C2 and C3 are the Niño 3.4, DMI and AAO coefficients, respectively. Note that in this equation, each object-based property at every time step is matched with its monthly climate indices. In addition, in order to derive the partial correlation between every index and storm property, all the coefficients have been normalized according to Eq. 9. In this equation, σ is the standard deviation, “index coefficient” refers to any coefficient calculated from Eq. 8 and “index normalized coefficient” is the partial correlation between every index and system property. Note that the monthly time series of Niño 3.4, DMI and AAO, which represent ENSO, IOD and SAM, respectively, are accessible from the NOAA Physical Sciences Laboratory (information available online at https://psl.noaa.gov/data/climateindices/list/).

$$\text{System property} = C_{0} + C_{1}\times \text{Ni\~{n}o 3.4} + C_{2}\times \text{DMI} + C_{3}\times \text{AAO}$$
(8)
$$\text{index normalized coefficient} = \frac{\text{index coefficient} \times \sigma\left(\text{index values at each month}\right)}{\sigma\left(\text{storm properties at each month}\right)}$$
(9)
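A minimal sketch of fitting Eq. 8 and normalizing the coefficients with Eq. 9 is given below. It is not the authors' code: the function names and the synthetic index values are hypothetical, the fit solves the least-squares normal equations with a small Gaussian elimination so that only the standard library is needed, and how samples are grouped by month before fitting is left to the caller.

```python
import statistics


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x


def fit_ols(index_rows, system_property):
    """Least-squares fit of Eq. 8; returns [C0, C1, C2, ...]."""
    design = [[1.0, *row] for row in index_rows]   # leading 1.0 carries the constant C0
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, system_property)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def normalized_coefficients(coefficients, index_rows, system_property):
    """Eq. 9: coefficient * sigma(index) / sigma(property), for each index."""
    sigma_y = statistics.pstdev(system_property)
    return [c * statistics.pstdev([row[k] for row in index_rows]) / sigma_y
            for k, c in enumerate(coefficients[1:])]


if __name__ == "__main__":
    # Synthetic (Nino 3.4, DMI, AAO) values, for illustration only.
    rows = [(0.5, 0.1, -1.0), (-0.3, 0.4, 0.2), (1.2, -0.2, 0.8),
            (-0.8, 0.0, -0.5), (0.1, -0.5, 1.5), (0.9, 0.3, -0.7)]
    prop = [2.0 + 1.0 * n34 - 0.5 * dmi + 0.25 * aao for n34, dmi, aao in rows]
    coefs = fit_ols(rows, prop)
    print([round(c, 6) for c in coefs])
    print(normalized_coefficients(coefs, rows, prop))
```

Because Eq. 9 uses a ratio of standard deviations, using the population or the sample standard deviation gives the same normalized coefficient.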

4 Results

The object-based system properties are compared in different seasons in Sect. 4.1, followed by detailed analyses of the systems originating on land and ocean. Then, the diurnal cycles of system and object properties are analyzed in Sect. 4.2. Finally, in Sect. 4.3 the main contributing systems with similar object-based properties are clustered, and the effect of climate variability (i.e., ENSO, IOD and SAM) on these clusters is investigated. Note that by applying the object-based technique over the study period, 1,218,787 objects and 35,445 systems have been identified. Of the detected objects, 252,636, 277,406, 371,389 and 317,356 are detected in spring (SON), summer (DJF), autumn (MAM) and winter (JJA), respectively. The corresponding numbers for systems are 8233, 8091, 10,080 and 9041. The similar numbers of objects in the different seasons are entirely consistent with the relatively small variation in rainfall as a function of season, although, as the clusters show, the associated forcings are quite different. Note that many figures in this section use abbreviations in titles, axis labels, etc. due to lack of space. For their meanings, please refer to Table S8 in the supplemental materials.

4.1 Seasonal analyses of system/object properties

Figures 2 and 3 present the PDFs of the system and object properties, respectively, in different seasons. For these figures, we define the “average”, “typical” and “extreme” values of a property as the mean, the mode and the average of the top 5%, respectively. Figures 2, S1 and Table S1 show that systems in summer and spring tend to move towards the east-southeast (Fig. 2a) and are larger (~ 30% in average size; Fig. 2d and ~ 45% in volume; Fig. S1h) compared to the autumn and winter systems, which usually move northward. Extreme rain intensity is higher the warmer the season, since systems with maximum average intensity above 40 mm/h (top 10% of systems in maximum intensity) during spring, summer, autumn and winter have occurred 1194, 1662, 1182 and 441 times, respectively, within the study timeframe. However, comparing the modes of the rain intensity PDFs (Fig. 2b) shows that typical systems in autumn are more intense (15% in mean intensity and 30% in max intensity) than those in the other seasons. Note that systems in summer tend to include more root contributing objects and split/merge events than those in other seasons (see Table S1), and the differences are clearer in the top 10% (Fig. S1a, b).

Fig. 2 System properties’ distributions for different seasons. Note that the bins in the x-axis are equally spaced in the logarithmic scale. Panel “c” is also presented in polar coordinates. Purple, red, orange and blue lines refer to spring, summer, autumn and winter, respectively. The y-axis in cartesian plots and radius in polar plots show the normalized frequency ranging from 0 to 100. Note that all the angles shown in panel “c” are azimuthal angles. For more information regarding other system properties please refer to Fig. S1 in online resources 1

Fig. 3 The same as Fig. 2 but for object properties. For more information regarding other object properties please refer to Fig. S2 in online resources 1

According to Fig. 3 (and Fig. S2 in online resources 1), objects in different seasons have almost the same characteristics as the systems described above. However, the differences between seasons are smaller, and the seasons look similar in some properties, such as area (Fig. S2a). Typical summertime objects are mostly oriented around 42° from north, while in autumn and spring the object orientation angles change to around 66°, and in winter objects are mostly oriented west–east (~ 100°; Fig. 3a). Note that in autumn objects and systems tend to move slower (between ~ 10–20%; Fig. 2c) and look slightly more symmetric (Fig. 3b) than in other seasons.

Precipitation systems may have different characteristics depending on where they originate and whether they are located over land or ocean, due to the distinct air mass properties and environmental characteristics of these two surface types. Here, we compare the average and extreme properties of systems originating on land (L-systems) vs. ocean (O-systems) in different seasons by comparing their PDFs shown in Fig. 4 (and in more detail in Fig. S3 and Tables S2 and S7 in Online Resources 1). The results show that L-systems, on average, are at least ~ 50% larger (Fig. 4a, b) and move ~ 10% faster (Fig. 4e, f) than O-systems in all seasons. In spring and summer, systems mostly (~ 55%) initiate over land and ~ 14% of total systems reach the ocean. Those that reach the ocean have higher maximum (~ 40–50%; Fig. 4c, d) and mean average intensity (~ 5–15%; not shown) while raining over land than while raining over the ocean. However, systems in autumn and winter often (~ 60%) originate over the Tasman Sea and less than 10% of them reach land (see Table S7). During these two seasons, O-systems are more spatially concentrated, with higher rain rates and smaller areas than L-systems. In summer and spring, both types of systems (land and ocean) generally move towards the east-southeast (Fig. 4g). However, in autumn and winter, they have different directions (Fig. 4h): L-systems still tend to move east-southeastward, but O-systems usually move northward. Note that extreme (top 5% rain intensity) L-systems during autumn have rain intensities comparable to those of O-systems (see Table S2).

Fig. 4

PDFs of system properties for systems originating on land (solid lines) and ocean (dashed lines). All differences between land and ocean distributions are statistically significant based on the Kolmogorov–Smirnov test. NL and NO indicate the number of systems originating on land and on the ocean, respectively. For more information regarding the systems’ properties in other seasons, please refer to Fig. S3 in Online Resources 1

Further detailed comparisons are also shown in Figs. S5 (area) and S6 (max mean intensity) and Table S4 in Online Resources 1 for L-systems that reach the ocean (LO-systems), L-systems that remain over land (LL-systems), O-systems that reach land (OL-systems) and O-systems that remain over the ocean (OO-systems).

Next, we examine how the characteristics of LO- and OL-systems change during the transition of systems between land and ocean, by comparing the PDFs of system snapshots over land and ocean. Note that the term “system snapshot” here refers to the system at a specific time-step. LO-systems during summer/spring, on average, have ~ 15–30% (Table S3) higher max average intensity over land than later over the ocean (Fig. 5c). However, during autumn and winter, these differences are smaller (Fig. 5g). Summer and spring OL-systems tend to be ~ 45% smaller over the ocean (Fig. 5b) without much change in rainfall intensity (Fig. 5d). In addition, autumn and winter OL-systems are also more spatially concentrated, with ~ 15% higher max intensity (Fig. 5h) and 15–30% smaller sizes when they are raining over the ocean than when they are over land (Fig. 5f). Note that there is not much change in the mean area of LO-systems after they reach the ocean (Fig. 5a, e). A more detailed comparison can be found in Fig. S4 in Online Resources 1.
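The captions of Figs. 4 and 5 state that land and ocean distributions are compared with the Kolmogorov–Smirnov test. A minimal sketch of such a two-sample comparison is given below, assuming SciPy's `ks_2samp` and a 0.01 significance level (the level quoted for Fig. 5). The two samples are synthetic stand-ins, not data from this study.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)

# Synthetic stand-ins for a system property (e.g. area in km^2) over land and ocean.
land_area = rng.lognormal(mean=5.0, sigma=0.8, size=2000)
ocean_area = rng.lognormal(mean=4.6, sigma=0.8, size=2000)

# Two-sample KS test: the statistic is the largest gap between the two empirical CDFs.
result = ks_2samp(land_area, ocean_area)
significant = result.pvalue < 0.01
```

Because the test compares whole empirical distributions, it can flag a difference even when the means of the two samples are similar.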

Fig. 5

PDFs of system snapshot properties for systems originating on land and transitioning to the ocean (panels in the first and third columns) and originating on the ocean and transitioning to land (panels in the second and fourth columns). Solid and dashed lines refer to the parts of the systems that are over land and over the ocean, respectively. S_no refers to the number of storms that start from land and reach the ocean or vice versa, and S_snp_L and S_snp_O refer to the number of system snapshots over land and over the ocean, respectively. Except for the first column, all differences between land and ocean distributions are statistically significant at the 0.01 level based on the Kolmogorov–Smirnov test. For more information regarding the systems’ properties in other seasons, please refer to Fig. S4 in Online Resources 1

4.2 Diurnal cycle of object properties

The diurnal cycle varies with season and with the location of storms (land/ocean). Summertime objects have an afternoon peak around 15:00 Australian Eastern Standard Time (AEST = UTC + 10 h) in terms of size (Fig. 6d) and intensity (Fig. 6e). However, winter objects do not have such a peak during the day (Fig. 6j–l). In all seasons except winter, high-intensity objects over the ocean, on average, are more intense (~ 5–15%) around 20:00–09:00 AEST (Fig. 7h) than their land counterparts, but in the afternoon and evening, the opposite is true with a larger difference (up to 50%). During spring and summer, fast-moving ocean objects, on average, move faster (up to 20%) than land objects during 10:00–20:00 AEST with a peak around 15:00 AEST (Fig. 6i). However, during the other times of the day, extreme land objects have faster translation speeds, by up to 20%. Note that in winter, these peaks are less clear and land objects are mostly faster than their ocean counterparts (Fig. 6l). For more information, please refer to Fig. S9 in Online Resources 1.
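A diurnal cycle of this kind can be obtained by shifting UTC time stamps to AEST and averaging a property within each local hour. The sketch below is illustrative only: the helper `diurnal_mean` and the synthetic intensity series are hypothetical, and the only element taken from the text is the offset AEST = UTC + 10 h.

```python
import numpy as np

AEST_OFFSET_H = 10  # AEST = UTC + 10 h, as stated in the text

def diurnal_mean(hour_utc, values):
    """Mean of `values` for each local (AEST) hour 0..23; NaN where there is no data."""
    hour_aest = (np.asarray(hour_utc, dtype=int) + AEST_OFFSET_H) % 24
    values = np.asarray(values, dtype=float)
    sums = np.bincount(hour_aest, weights=values, minlength=24)
    counts = np.bincount(hour_aest, minlength=24)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(counts > 0, sums / counts, np.nan)

# Synthetic object intensities with an imposed maximum at 05:00 UTC (15:00 AEST).
hour_utc = np.tile(np.arange(24), 50)
intensity = 5.0 + 2.0 * np.cos(2 * np.pi * (hour_utc - 5) / 24)

cycle = diurnal_mean(hour_utc, intensity)
peak_hour_aest = int(np.nanargmax(cycle))
```

Applying the same function separately to land and ocean objects, or to the upper tail of a property, would give the kind of land/ocean and extreme-value comparisons described above.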

Fig. 6

Diurnal cycle of object properties overall (black; panels a–c) and for different seasons (panels d–f). The results are also presented over land and ocean for different seasons in panels (g–l). NL and NO show the number of objects in each season over land and ocean, respectively. For more information regarding the systems’ properties in other seasons, please refer to Fig. S9 in Online Resources 1. The uncertainty bands are also shown in Fig. S9

Fig. 7

Panel a is the projected 5-dimensional object-based system data on a 2D map. The colour on this map shows the density. Panel b shows the variation of CHI with the number of clusters. Panel c represents the clustered groups of the 2D map (panel a) using the Agglomerative method. Panels d–h show the PDFs of different properties for the clustered systems. The annual cycle of each cluster is shown in panel (i)

4.3 Clustering analysis

Large- or small-scale atmospheric conditions may result in different types of precipitation systems with unique characteristics over a specific region. Identifying these groups and their typical characteristics and frequency helps us to better understand the rainfall climatology of the region and to better investigate the effect of large- or small-scale drivers on rainfall there. Using clustering algorithms, we have grouped the systems with similar object-based properties over the study area. Here, we first projected the multi-dimensional object-based precipitation system data onto a 2D map using the t-SNE algorithm and then applied the Agglomerative clustering technique to the projected data. The input system properties selected for the clustering algorithm should not be highly dependent on each other. Therefore, we have calculated the correlation between all pairs of the studied object properties to identify the dependent properties. The correlations between the system/object properties (Fig. S10) show that area vs. volume discharge and maximum intensity vs. average intensity are highly dependent on each other and should not be used together as inputs to the clustering algorithm. Therefore, the selected input properties (those left after excluding volume discharge and average intensity) for the clustering algorithm are: (1) mean area, (2) mean translation speed, (3) max average intensity and (4) mean translation direction, which is decomposed into x and y components that are treated as two independent input variables in the clustering algorithm. Note that we have clustered the systems rather than objects, since the motivation here is to identify different types of precipitation systems.
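The two preparation steps described above, screening for highly dependent properties and splitting the direction into two components, can be sketched as follows. The function names, the correlation threshold of 0.8 and the azimuth convention (degrees clockwise from north) are assumptions for illustration, since the text does not specify them, and the data are synthetic. One common motivation for the decomposition is that direction is circular, so that 359° and 1° are close even though their numerical values are far apart.

```python
import numpy as np

def direction_components(azimuth_deg):
    """Split an azimuth (degrees clockwise from north) into x (east) and y (north) parts."""
    theta = np.deg2rad(np.asarray(azimuth_deg, dtype=float))
    return np.sin(theta), np.cos(theta)

def highly_correlated_pairs(columns, threshold=0.8):
    """Return name pairs whose absolute Pearson correlation exceeds `threshold`."""
    names = list(columns)
    corr = np.corrcoef(np.vstack([columns[name] for name in names]))
    return [(names[i], names[j])
            for i in range(len(names)) for j in range(i + 1, len(names))
            if abs(corr[i, j]) > threshold]

# Synthetic system properties; "volume" and "mean_intensity" are built to track
# "area" and "max_intensity" closely, mimicking the dependence reported in the text.
rng = np.random.default_rng(2)
n = 1000
area = rng.lognormal(5.0, 0.8, n)
max_intensity = rng.lognormal(2.5, 0.5, n)
props = {
    "area": area,
    "volume": area * rng.normal(1.0, 0.05, n),
    "max_intensity": max_intensity,
    "mean_intensity": 0.4 * max_intensity * rng.normal(1.0, 0.05, n),
    "speed": rng.normal(35.0, 10.0, n),
}

pairs = highly_correlated_pairs(props)
dx, dy = direction_components([0.0, 90.0, 180.0])
```

Dropping one member of each flagged pair and replacing the direction by its two components leaves five input variables, which matches the 5-dimensional data described in the caption of Fig. 7.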

The results show that there are five system clusters (types) with similar object-based properties occurring over the study area. This number is based on the optimum value of CHI against the number of clusters (Fig. 7b; for more details see Sect. 4.2). Figures 7d–h show the PDFs of system properties for the different system types identified over the study area using the Agglomerative and t-SNE algorithms. Based on these results, the individual system types have the following characteristics:

  • Type 1 (T1) systems have a peak frequency in autumn and are mostly medium-sized (average size of ~ 133 km²), with the lowest translation speeds (~ 23 km/h) but very high rainfall intensities (second-highest; mean average intensity of ~ 5.4 mm/h) compared to other groups. Around 69% of these systems originate on the ocean and often stay over the ocean (~ 56% of total) while moving towards the north. This type often features more isolated oceanic convection.

  • Type 2 (T2) systems often originate over land (54%), move south-eastward, often include the fastest systems (~ 53 km/h) and appear in large sizes (~ 327 km²) but with low rainfall intensities (mean average intensity of ~ 4.3 mm/h). They occur throughout the year with a frequency peak in spring.

  • Type 3 (T3) systems mostly occur during winter with a frequency peak in June. They often originate on the ocean (66%) and remain over the ocean (58% of total) while moving northward. They often include very slow systems (~ 29 km/h) with the smallest sizes (~ 64 km²) and low intensities (mean average intensity of ~ 4.6 mm/h) compared to the other types.

  • Type 4 (T4) systems mostly occur during the summer over land (54%), often include the most extreme systems in terms of rainfall intensity (mean average intensity of ~ 8.15 mm/h) and often appear in large sizes (260 km²), moving eastward with low translation speeds (~ 32 km/h). Some of these storms resemble mesoscale convective systems.

  • Type 5 (T5) systems mostly include very fast systems (47 km/h) with small sizes (76 km²) and low rainfall intensities (4.6 mm/h) that often occur during the winter, mostly originate on the ocean (64%) and stay over the ocean (58% of total) while moving northward.

Tables S5 and S6 in Online Resources 1 present more detailed information about the system types. To further demonstrate the characteristics of each cluster, a video representing the typical systems for each cluster is provided in the supplemental material (Online Resources 2 or https://www.youtube.com/watch?v=5o40ul5iirc).
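The clustering workflow behind these system types (projection with t-SNE, Agglomerative clustering of the 2D map, and selection of the number of clusters from CHI) can be sketched with scikit-learn. Several points here are assumptions rather than details from the text: CHI is taken to be the Calinski–Harabasz index with its maximum as the optimum, the inputs are standardized before projection, and the t-SNE settings and the range of candidate cluster numbers are arbitrary. The feature table is synthetic.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.manifold import TSNE
from sklearn.metrics import calinski_harabasz_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)

# Synthetic 5-feature table (area, speed, max intensity, direction x, direction y)
# made of three artificial groups; not data from the study.
centres = rng.normal(0.0, 4.0, size=(3, 5))
features = np.vstack([c + rng.normal(0.0, 1.0, size=(100, 5)) for c in centres])

# Standardize, then project the 5-dimensional data onto a 2D map.
scaled = StandardScaler().fit_transform(features)
embedding = TSNE(n_components=2, perplexity=30, init="pca",
                 random_state=0).fit_transform(scaled)

# Cluster the 2D map for several candidate cluster numbers and score each with CHI.
scores = {}
for k in range(2, 9):
    labels = AgglomerativeClustering(n_clusters=k).fit_predict(embedding)
    scores[k] = calinski_harabasz_score(embedding, labels)

best_k = max(scores, key=scores.get)
final_labels = AgglomerativeClustering(n_clusters=best_k).fit_predict(embedding)
```

Since t-SNE is stochastic, fixing `random_state` keeps the map and the resulting clusters reproducible between runs.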

In order to understand how these clusters relate to synoptic conditions, we have looked at mean sea level pressure and 200 hPa BoM weather maps for 10 typical cases for each cluster. The common synoptic conditions are shown in Fig. 8. Note that the 10 typical cases in each cluster are the top 10 highest values in Fig. 7a. According to these weather maps, T2 and T4 systems seem to be often related to surface troughs and fronts. For T2 cases there is usually a high-pressure system somewhere east of the study area, consistent with the southeastward flow (Fig. 8b). This is not so common in the T4 rain systems, and when there is a high-pressure system, it is usually farther south (Fig. 8d). Note that T2 and T4 systems include large-scale heavy storms, as all the top 1% sized systems are T2 (364 out of 400) or T4 (36 out of 400) systems. The 99.9th size percentile cases were generally elongated/oriented parallel to the surface trough axis (see Fig. S17; Online Resources 1). T1, T3 and T5 systems seem to result from the low-pressure systems that generally form and/or intensify over the Tasman Sea. T3 mostly features a high-pressure system south of the mainland, and there are no fronts over eastern Australia (Fig. 8c). This is consistent with their low translation speed. Although a few T1 or T5 cases have this feature, it is not common among them. T5 seems to be related to a low-pressure system closely interacting with a high-pressure system, creating tight isobars and fast systems over the ocean (Fig. 8e). In addition, upper-level troughs over the Tasman Sea are often observed in 200 hPa weather maps for typical T1, T3 and (more clearly) T5 systems (Fig. 8f, h and j).

Fig. 8

Mean sea level pressure (MSLP; a–e) and 200 hPa (f–j) weather maps (provided by the Bureau of Meteorology; http://www.bom.gov.au/australia/charts/archive/index.shtml) for a typical rain system example from each storm type

We also investigated whether significant relationships exist between system properties and climate mode indices. Although no statistically significant relationships were found when investigating all systems, we have found some significant relationships between climate indices and system properties in each cluster (identified in the previous section) using a multiple regression model described in Sect. 4.3. Based on this approach, 75 regression models have been produced (3 indices × 5 storm properties × 5 clusters). In order to identify robust associations, the coefficient for a given index in Eq. 8 must have the same sign (positive or negative) for at least five consecutive months and it must be significantly different from zero in at least three of these months. Since these climate indices often affect weather over a period of 3 to 9 months, this restriction helps to exclude short periods in which precipitation has a statistically acceptable link with climate indices that is probably not physically meaningful. Of all the results, only five regressions passed this criterion, and they are shown in Fig. 9. The results show that T3 and T5 systems have a negative correlation with ENSO in the cold season, with lower rainfall intensity in El Niño and higher rainfall intensity in La Niña (Fig. 9a, b). ENSO also shows a relationship with T1 and T3 mean translation speeds during the warm season, with a positive correlation (faster in El Niño and slower in La Niña; Fig. 9c, d). Finally, the IOD has also been shown to have a positive correlation with T1 mean translation speed from mid-summer to early winter (Feb, Apr and Jun; Fig. 9e). Note that in Fig. 9, all the coefficients have been normalized to derive the partial correlation between every index and system property according to Eq. 9.
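The robustness criterion stated above (same sign for at least five consecutive months, significant in at least three of them) can be expressed as a short check on twelve monthly coefficients. The function name and the example values are hypothetical, and treating the months as cyclic (December followed by January) is an assumption of this sketch; the text does not say how the year boundary is handled.

```python
def passes_criterion(coefs, significant, min_run=5, min_sig=3):
    """Check the robustness rule on monthly regression coefficients.

    True if some run of at least `min_run` consecutive months shares one sign
    and at least `min_sig` months within that run are significant.
    Months wrap around the year boundary (an assumption of this sketch).
    """
    n = len(coefs)
    signs = [(c > 0) - (c < 0) for c in coefs]
    for start in range(n):
        sign = signs[start]
        if sign == 0:
            continue
        length = 0
        n_sig = 0
        # Extend the run forward from `start` while the sign stays the same.
        while length < n and signs[(start + length) % n] == sign:
            n_sig += bool(significant[(start + length) % n])
            length += 1
        if length >= min_run and n_sig >= min_sig:
            return True
    return False

n_models = 3 * 5 * 5  # indices x storm properties x clusters

# Hypothetical coefficients for Jan..Dec: negative from Feb to Jul,
# significant in Mar, May and Jul.
coefs = [0.1, -0.2, -0.3, -0.1, -0.4, -0.2, -0.3, 0.2, 0.1, -0.1, 0.3, 0.2]
sig = [False, False, True, False, True, False, True,
       False, False, False, False, False]

robust = passes_criterion(coefs, sig)
weak = passes_criterion(coefs, [False] * 12)
```

In this example the first call passes because the six-month negative run contains three significant months, while the second fails because no month is significant.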

Fig. 9

Selected regression coefficients for the relationship between climate mode indices and the object-based system properties in different months. The selection criteria are explained in the main text. The coloured bars show the normalized coefficients that are significantly different from zero. Note that the colours represent the system types and match the colours of the system types in Fig. 7

5 Discussion

The results presented in Sect. 4.1 are broadly consistent with previous studies, but with some notable exceptions. For example, during summer, systems are mostly larger, move faster and have higher rainfall intensities than systems in winter (Fig. 2). This is in agreement with previous studies over Australia reporting that severe thunderstorms are most prevalent in the warm seasons (Niall and Walsh 2005; Schuster et al. 2005; Davis and Walsh 2008; Warren et al. 2020). The extreme summer and spring systems in terms of rainfall intensity are more intense when they are raining over land between 10:00 and 20:00 (AEST; peaking in the afternoon) than when they are raining over the ocean, consistent with previous studies of thunderstorms (Griffiths et al. 1993) and hailstorms (Schuster et al. 2005) over Sydney based on weather reports. Note that the system directions in different seasons are also in agreement with the average wind directions at the 850 hPa level in ERA5 Reanalysis data (Hersbach et al. 2020). For more information, please see Figs. S15 and S16 in Online Resources 1.

The results show that there are more precipitation systems during autumn and winter than during spring and summer. However, previous studies reported that thunderstorms are more prevalent during the warm seasons. The first reason behind this apparent contradiction is that most of those studies focused on severe thunderstorms or storms with deep convective clouds and high storm tops that are often accompanied by electrification, which are less frequent during the cold seasons. In addition, they often employed ground observations that provide information for storms only over land (Niall and Walsh 2005; Schuster et al. 2005; Davis and Walsh 2008) whereas, based on our findings, autumn and winter systems mostly initiate over the ocean and tend to move northward, causing more precipitation over the ocean than over land (see Table S7). Thus, previous studies using gauge data and reports omitted the storms over the ocean in these seasons. Using lightning records, Dowdy and Kuleshov (2014) also showed that a maximum in lightning activity during the cooler months occurs over the ocean to the east of Australia, which is consistent with our results. However, they reported a higher frequency of thunderstorms during the warm season. Since precipitation systems in the cold season are often small-scale with low rainfall intensity, many of them are not accompanied by lightning. Therefore, a large number of systems over the ocean during this season were probably missed in that study. A probable reason for the higher frequency of rain systems in the cold season is discussed later.

Systems’ characteristics also change during the transition of storms between land and ocean, such as the decrease in max intensity for summertime land-originating systems when moving over the ocean and for wintertime ocean-originating systems when moving over land. These variations are probably related to a change in boundary layer instability, and show the immediate effect of a change in air mass characteristics (land/ocean) on the storms. These characteristics can include surface temperature and humidity, sea/land breezes and topographical interactions, the effects of elevated mixed layers advected over the coast, low-level wind shear and convergence.

Using clustering analysis, we have found that there are five precipitation system types with similar object-based properties over this region (Fig. 7), which are described in detail in Sect. 4.3. The most frequent system types in spring, summer, autumn and winter are T2, T2, T1 and T3, respectively (Table S10). Among these clusters, T1 systems have higher rainfall intensity, which is probably the reason why typical rain systems in autumn have higher rain rates than in the other seasons. Note that according to Table S10, cold-season rain systems often include T1, T3 and T5 systems. Typical rain systems in these categories are often coincident with upper-level divergence caused by upper-level troughs over the Tasman Sea, which destabilizes the surface weather patterns and permits deep convective updrafts. These conditions over the ocean seem to occur more frequently in the cold season than the conditions that cause precipitation over land in the warm season.

Furthermore, under weakly forced warm season environments (e.g., T2 and T4), local processes can play a significant role in the initiation and evolution of thunderstorms. Potts et al. (2000), May et al. (2004), Wilson et al. (2004) and Keenan et al. (2003) noted that mesoscale convergence associated with the sea breeze can result in enhanced storm activity across the coastal plains of Sydney. In addition, Potts et al. (2000) reported that the terrain to the west of Sydney frequently acts to initiate thunderstorms, which then move eastward across the coastal plain and decay as they move out over the ocean. The observed storm modes with different directions reported in these past studies are likely the result of the interactions between local processes and the large-scale forcing.

Among these clusters, three system types might be accompanied by high-impact weather, due to their special characteristics. T1 systems have less risk of flash floods in this region since they often rain over the ocean and are not large in area. However, the ones that move toward the west-northwest and have enough area can create flash floods if they hit land, such as the high-impact event on 30 May 2011, in which this type of system made a dominant contribution to flash flooding in some of Sydney's eastern suburbs. Another example is the event on 28 Feb 2014, when incidents of flash flooding were reported across Lake Macquarie. T2 systems can also generate severe winds, such as in the event that occurred on 8 January 2003, which brought a severe wind gust with a maximum of 109 km/h. They also made a dominant contribution to the storm with the severe wind gust of 91 km/h on 9 Feb 2017. Finally, T4 systems often include severe storms that can create devastating flash and riverine floods in this region. Their dominant contribution to the East Coast Low event on 1 and 2 February 2005 and to the severe storm on 25 Apr 2015 caused hail and flash floods in this region.

Our results in investigating the relationships between climate mode indices and system properties are consistent with previous studies reporting a weak or no correlation between ENSO and rainfall intensity (Pepler et al. 2014; Ashcroft et al. 2019; Risbey et al. 2009a, b) and frequency (Allen and Karoly 2014; Dowdy 2020) in coastal eastern and southeastern Australia. However, we have found a signal between ENSO and T3 and T5 system types. Note the signals are still weak and could be subject to uncertainties due to serial correlation in the data.

In summary, the results show that the precipitation system intensity variations are consistent with diurnal/seasonal cycles and for some system types are related to climate mode oscillations. However, other characteristics of the precipitation systems like size and translation speed do not seem to always follow the same relationship. This suggests that further investigations are required to find a more definitive answer to the effect of atmospheric parameter variations (e.g. temperature, humidity, etc.) on system properties other than intensity.

6 Conclusion

In this study, we establish an object-based precipitation system climatology using an S-band weather radar located near Wollongong, NSW with more than 20 years of records (1996–2018). The study area is the radar coverage region (including land and ocean), within 150 km of the radar. Here, we employed the Method of Object-based Diagnostic Evaluation (MODE) Time Domain (MTD) to detect and track the precipitation systems. This object-based approach helps us to better understand the climatology of precipitation system properties (other than rainfall intensity and frequency) that have not been explored in previous studies over the study area.

The extreme precipitation systems in terms of size, intensity and translation speed are more frequent during summer and spring. Precipitation systems in these seasons mostly originate on land, move towards the east-southeast, and are larger, faster and more intense than the systems originating on the ocean. In these seasons, between ~ 10 am and ~ 8 pm (AEST), precipitation systems raining over land (wherever they originate) are often larger and have higher rain rates but move slower than when they are raining over the ocean (with a peak in rain intensity and translation speed in the afternoon that is consistent with the diurnal maximum of boundary layer instability). However, later at night into early morning, precipitation systems raining over the ocean are slower, larger and more intense than when they are over land.

Although severe storms are more frequent during summer and spring, typical storms in autumn have higher rainfall intensity than in the other seasons. In addition, precipitation systems are more frequent in autumn and winter than in summer and spring. During the cold season, precipitation systems mostly initiate over the ocean and tend to move northward, which causes more precipitation over the ocean than over land. As in summer and spring, ocean-originating systems during these seasons are typically smaller and move slower but have higher rainfall intensity than the land-originating systems, which still tend to move east-southeastward.

The results show that the change in the air mass characteristics (land/ocean) can immediately affect the precipitation systems’ properties. For instance, the land-originating systems that cross to the ocean in summer are more intense over land than ocean. However, in winter, ocean-originating systems that hit the land are more intense over ocean than land. These changes in precipitation systems’ properties during their lifetimes might be related to the differences in surface temperature and humidity, sea/land breezes and topographical interactions, the effects of elevated mixed layers advected over the coast, low-level wind shear and convergence. Although the biggest driver is likely to be larger instability over the warmer surface, further research is needed to find a definitive answer.

We have found five types of precipitation systems with distinct object-based properties using clustering techniques. By looking at synoptic charts for some typical systems of each type, we have found that T1, T3 and T5 systems peak in the cold season and all have cooler, maritime southerly to south-easterly surface flow and stronger upper-level troughs, supporting the destabilization of the surface onshore flow over the study area. Further, southerly flow for T3 and T5 is sourced significantly further south (Tasman Sea) and therefore is comparatively cooler, contributing to the lower observed intensity of storms. Surface convergence from an offshore trough may also enhance convective storms in T1, leading to the observed increased intensity compared to T3 and T5. This type of system (T1) has the potential to create flash floods when it moves onshore. West to north-westerly flow during T2 and T4 brings a warmer airmass over the Sydney region ahead of an inland trough. The trajectory of this airmass along eastern Australia likely also entrains a modified maritime airmass, resulting in a relatively moist, warm airmass over central New South Wales. In addition to the higher equivalent potential temperature of the surface air in T2 and T4 cases, the inland trough also contributes to enhanced surface convergence. Note that these systems are more frequent in the warm season, and can be a source of devastating flash floods and severe wind gusts over this region.

We also studied the connection between climate modes and precipitation systems’ properties for different clusters using a linear multivariate approach. The results show that in the cold season, T3 and T5 systems have a negative correlation with ENSO, with lower rainfall intensity in El Niño and higher rainfall intensity in La Niña. In addition, ENSO is associated with T1 and T3 system translation speeds during the warm season, with a positive correlation (faster in El Niño and slower in La Niña). Finally, the IOD has also shown a positive correlation with T1 system translation speeds from mid-summer to early winter (Feb, Apr and Jun). Note that the signals are still weak and could be subject to uncertainties due to serial correlation in the data.

This study showed that the object-based approach offers great potential to better understand aspects of precipitation systems that cannot be examined using a pixel-based approach. We have established a comprehensive precipitation system climatology, which allows future study of any changes in system characteristics due to climate change. However, this study is limited to a specific region in Australia, and further research is needed to conduct the same study over other regions or over a larger scale using regional/global datasets. In addition, many questions remain unanswered in understanding precipitation system structures and the effects of other contributing parameters such as temperature, humidity, topography and surface type on precipitation system properties, which need further investigation.