Sensitivity of identifying cut-off lows in the Southern Hemisphere using multiple criteria: implications for numbers, seasonality and intensity

Cut-off Lows (COLs) are often associated with heavy precipitation and strong wind events, but there are still uncertainties about how their identification affects their numbers and seasonality. This paper aims to determine the sensitivity of identifying Southern Hemisphere COLs in the ERA-Interim reanalysis to different types of identification criteria. Upper-level cyclones are initially tracked on the 300 hPa level using relative vorticity and geopotential in order to present different perspectives. This reveals significant differences between the numbers and lengths of the identified tracks for the two fields. To identify the COLs, different post-tracking filters are applied in two steps. Firstly, three filters are considered to separate cut-off lows from open troughs by sampling winds at different offset radial distances from the upper-level cyclonic centres. Secondly, potential vorticity and temperature criteria are imposed to determine how these conditions affect the identified COLs in terms of numbers, seasonality and intensity. It was found that methods based on multiple criteria restrict the COL identification by imposing specific characteristics, while methods based on simpler schemes (e.g. using only winds) can detect larger samples of COLs observed visually in the geopotential maps. Although it is difficult to say which method is more accurate in identifying COLs, because of the subjective aspect of the observer's assessment, a scheme using only winds should be more representative of reality, as it simply imposes a cyclonic circulation appearance on the detection system regardless of the physical and dynamical characteristics. Therefore, this type of method could be considered as a standard method for identifying COLs that can be used for either operational or research purposes.


Introduction
Cut-off Lows (COLs) are synoptic-scale low pressure systems that form at upper levels and can be identified as minima in geopotential charts. The historical interest in COLs is motivated by their importance to the local weather as they are often responsible for heavy precipitation and floods in different regions of the world (Singleton and Reason 2006; Llasat et al. 2007; McInnes and Hubbert 2001). COLs also play an important role in stratosphere-troposphere exchange, where air and atmospheric constituents are transported across the tropopause, and occasionally lead to episodes of high ozone concentration in the lower troposphere (Ancellet et al. 1994; Rondanelli et al. 2002). In recent years, there have been considerable efforts to better understand observed COLs, with several studies focusing on their climatological aspects.

The earliest climatologies of COLs were performed for particular regions based on relatively old datasets and manual methods using visual inspection of available charts and satellite images (Taljaard 1985; Price and Vaughan 1992; Kentarchos and Davis 1998). The drawback in using manual methods is that they rely on subjective decisions based on human expertise, requiring an exhaustive analysis for long periods of data. The advances and improvements in observations and computer systems (e.g. more modern reanalysis products and automated techniques for atmospheric system detection) have allowed statistics of COLs to be reproduced fairly consistently over time by using larger samples of data and larger domains than used in earlier studies based on subjective analysis.
Developing objective methods for detecting and tracking atmospheric systems represents fundamental aspects of assessment and intercomparison between reanalyses or model outputs. However, a problem concerning the use of objective approaches is the variety of methods used to detect and track COLs. The idea of considering a unique method to identify COLs and providing consistent results is particularly important, but difficult to realise because of the different criteria often chosen to define the COL features motivated by different conceptual ideas of what a COL is. One of the more subtle points in COL identification is the method by which a COL is defined. Objective methods need to have an identifiable physical basis with specific constraints in order to yield consistent results.
A wide range of techniques have previously been used as a means of identifying COLs or mid-upper level tropospheric systems. These methods differ in the way features are detected and tracked in consecutive frames of data. COLs are typically analysed using local extremes of geopotential or vorticity with additional criteria introduced to determine the cut-off condition, though COLs can also be defined based on a conceptual model in terms of cold core, high potential vorticity (PV) anomaly, and baroclinic zones. Probably the most widely used objective method for COL identification is that developed by Nieto et al. (2005), which is based on the following steps: geopotential minimum, cut-off circulation, thermal frontal parameter, and equivalent thickness. The latter in particular is used to search for a thickness ridge east of the COL centre or, in other words, the cold core criterion, which is applied in the layer 200-300 hPa. Similar methods adapted for the Southern Hemisphere (SH) were used by Reboita et al. (2010) and Ndarana and Waugh (2010) with some differences in the level or layer used to compute the parameters. A particularly intriguing aspect of these differences is the different levels used to detect cold cores and how this choice affects the results. The differences in the methodologies certainly lead to discrepancies between studies, introducing uncertainties between the studies and their conclusions.
There are a number of studies describing different types of methods for the identification of COLs using automatic (Nieto et al. 2005; Ndarana and Waugh 2010; Favre et al. 2012; Pinheiro et al. 2017) or semi-automatic methods (Fuenzalida et al. 2005; Campetella and Possia 2007). Objective schemes appear more accurate in detecting COLs when using several parameters. To identify the COL features, some methods use PV as a stratospheric tracer (Wernli and Sprenger 2007; Ndarana and Waugh 2010) since the process of COL genesis is accompanied by tropopause folding and pronounced intrusions of anomalous cyclonic PV tongues, eventually leading to Rossby wave breaking (RWB) as discussed in Hoskins et al. (1985). More recently, Pinheiro et al. (2017) used a method to identify COLs that combines the search for a cold core with high cyclonic PV. These authors suggest that the use of a scheme based on multiple criteria allows specific features to be imposed on the selected sample, though this may dramatically reduce the number of detected systems.
A comparison between two different objective methods was presented by Nieto et al. (2008) for COLs in the Northern Hemisphere. One approach is based on a conceptual model proposed by Nieto et al. (2005) while the other uses PV to follow stratospheric and tropospheric cutoffs (Wernli and Sprenger 2007). The results show large differences in numbers and spatial distribution between the two identification methods, which can be attributed to how each method counts the features per unit area as well as the level where the COLs are identified. For the latter, the authors suggest that the more appropriate level to identify COLs depends on region and season since lower (higher) isentropic levels represent better the COLs occurring at higher (lower) latitudes.
The locations of the main areas of COL activity in the SH are generally consistent in the literature, but the numbers and seasonality differ considerably between studies. Therefore, a research question arises as to whether the discrepancies between studies are mainly due to uncertainties in using different reanalyses, as discussed in the recent study of Pinheiro et al. (2019), or whether the method used to identify COLs also has important implications for the results. In relation to the latter, large differences were found between the statistics produced by a tracking algorithm applied to vorticity and geopotential fields, since these focus on features of different spatial scales (Hoskins and Hodges 2005).
Since COLs are important for impacts such as heavy precipitation and strong winds, identifying COLs correctly may be helpful both in numerical weather prediction (NWP) and climate models. Thus, this study was motivated by attempts to understand how sensitive the identification of SH COLs is to different methods and schemes. The goal of this study is to examine the sensitivity of identifying COLs to the field used for tracking (relative vorticity or geopotential) and to the different filters used to separate COLs from other systems, which make use of simple schemes (using only horizontal winds) and multiple step schemes (using also temperature and/or PV). Hence, we intend to verify how the choice of method used to identify COLs affects their numbers, seasonality and intensity, attempting to demonstrate which method/criteria are the most reproducible and reliable for COL identification, and which may be considered as a single standard method to be adopted universally.

Reanalysis dataset
The data used in this study are obtained from the ERA-Interim (ERAI) reanalysis (Simmons et al. 2007), produced by the European Centre for Medium-Range Weather Forecasts (ECMWF). The six-hourly data are produced with a four-dimensional variational data assimilation (4D-Var) system. The reanalysis is performed with a spectral model on an N128 reduced Gaussian grid (corresponding to approximately 79 km or 0.75°) and 60 vertical hybrid levels. Compared to the older ECMWF 40-year Reanalysis (ERA-40), ERAI has an improved atmospheric model and data assimilation system with variational bias correction for satellite radiances, which minimizes inconsistencies in the observations (Dee et al. 2011). Sea surface temperature (SST) and sea ice concentration (SIC) are prescribed boundary conditions for the atmospheric forecast model. ERAI was used for a 36-year period (1979-2014) in this study.

Tracking and COL identification
In this study we used the TRACK algorithm (Hodges 1994, 1995, 1999) to track features of interest. The full process for tracking features of interest in the chosen fields is performed objectively in three main stages: pre-processing filtering, tracking, and post-tracking filtering. Initially the large-scale background is removed by performing a spherical harmonic decomposition of the fields and setting the coefficients of the total wavenumbers less than or equal to five to zero. For the 300-hPa geopotential (Z300) the zonal mean is first removed from the data, i.e., the mean value along each latitude circle is subtracted from every grid point at that latitude. The use of the zonal anomaly of geopotential (Z′300) for the tracking allows the "weak" extremes to be more easily identified than in the raw geopotential, particularly at lower latitudes where the geopotential gradient is typically weaker than at higher latitudes.
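The removal of the zonal mean described above can be illustrated with a minimal sketch. It assumes the field is stored as a list of latitude rows, each holding values along one latitude circle; the function name `zonal_anomaly` and the sample values are hypothetical and are not part of TRACK.

```python
def zonal_anomaly(field):
    """Subtract the zonal (along-latitude) mean from every grid point.

    field: list of rows, one row per latitude, values ordered by longitude.
    This layout is an assumption made for illustration only.
    """
    anomalies = []
    for row in field:
        zonal_mean = sum(row) / len(row)
        anomalies.append([value - zonal_mean for value in row])
    return anomalies


# Two hypothetical latitude circles of 300-hPa geopotential height (gpm).
z300 = [
    [9000.0, 9010.0, 8990.0, 9000.0],
    [9400.0, 9350.0, 9450.0, 9400.0],
]
z300_anom = zonal_anomaly(z300)
```

In this toy field the second row contains a −50 gpm anomaly, which is the kind of weak minimum that is hard to see in the raw geopotential but stands out once the latitude mean is removed.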
The tracking of SH COLs is performed using six-hourly data based on the minima of the 300-hPa relative vorticity (ξ300) and of Z′300 in order to present different perspectives. The data are also spectrally truncated to triangular truncation 42 (T42) on a Gaussian grid for relative vorticity, as this is a very noisy field, whilst T63 is used for the geopotential field, similar to previous studies (Pinheiro et al. 2017, 2019). The filtering focuses the identification on the same spatial scale for each data set, which allows a fairer comparison between data with different resolutions. In addition to the truncation, a spectral tapering is applied to reduce the Gibbs phenomenon in exactly the same way as done in Hoskins and Sardeshmukh (1984).
Candidate features are first identified as minima below chosen thresholds for each field, −1.0 × 10⁻⁵ s⁻¹ for ξ300 and −50 geopotential meters (gpm) for Z′300, where the minus sign for relative vorticity is required for cyclonic systems in the SH. Results indicate that the number of tracks obtained is sensitive to these thresholds. In general, increasing the magnitude of these thresholds (making them more negative) reduces the lifetime and the number of COLs detected. The use of weak negative values for the tracking step is appropriate to identify a large number of potential COLs, which will be filtered further in the post-tracking step in order to exclude upper-level troughs. If a large threshold is set for the tracking, weak COLs will be missed in the analysis.
The tracking is performed by first initialising a set of tracks from the feature points using the nearest neighbour approach; these tracks are then refined by minimising a cost function for track smoothness subject to adaptive constraints (Hodges 1999). The tracking is the same as used in Pinheiro et al. (2017) but with some modifications to the adaptive tracking constraints for displacement distance and smoothness. These are determined taking into account the type of motion for the observed systems of interest, supported by a limited sensitivity study which showed the different possible matches for the COL tracks. A description of the adaptive constraints used in this study and their differences from those applied in Pinheiro et al. (2017) is presented in the Appendix. Following completion of the tracking, the tracks are filtered to retain those that last longer than 24 h in order to exclude very short lifecycles, i.e. to eliminate or filter an unwanted type of motion (Holton 1992).
In order to identify the COLs from among all the tracked features, additional field information is added to the tracks with the aim of detecting a cut-off circulation. This is done by referencing the 300-hPa horizontal wind components (U300, V300) to the tracks at a fixed radial distance of 5° (geodesic distance) from the COL centre in several directions relative to the centre. Similar filters have been previously used as a means of identifying upper tropospheric vortices (Costa 2009; Morais 2016; Nieto et al. 2005). The three types of filters used to detect the possible cut-off low formation are shown in Fig. 1 and described as follows:
• Single-point filter: uses a single sample of U300 at a single offset point (similar to Nieto's algorithm) located at 180° relative to North (Fig. 1a), where the radial distance is 5° from the ξ300 or Z′300 minima. If the sample value is negative (easterly direction) for four consecutive steps (1 day), this track is considered a potential COL.
• Four-point filter: uses two samples for each of U300 and V300 at the four different offset points of 0° (U300 > 0), 90° (V300 < 0), 180° (U300 < 0), and 270° (V300 > 0) relative to North (Fig. 1b) at the radial distance of 5° from the ξ300 or Z′300 minima.
• Eight-point filter: uses four samples for each of U300 and V300 at the four different offset points of 45° (U300 > 0, V300 < 0), 135° (U300 < 0, V300 < 0), 225° (U300 < 0, V300 > 0), and 315° (U300 > 0, V300 > 0) relative to North (Fig. 1c) at the radial distance of 5° from the ξ300 or Z′300 minima.
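The sign conditions of the three wind filters can be sketched as follows. This is only an illustration of the rules listed above for a single time step, not the TRACK implementation: the dictionary layout, the function names `passes_filter` and `ideal_vortex`, and the idealised wind samples are all assumptions. Bearings are measured clockwise from North, and the winds are assumed to have already been sampled 5° from the centre.

```python
import math

# Required sign of (U, V) at each bearing; None means the component is not tested.
SINGLE_POINT = {180: (-1, None)}
FOUR_POINT = {0: (1, None), 90: (None, -1), 180: (-1, None), 270: (None, 1)}
EIGHT_POINT = {45: (1, -1), 135: (-1, -1), 225: (-1, 1), 315: (1, 1)}


def passes_filter(samples, expected):
    """Return True if every tested wind component has the required sign.

    samples: dict mapping bearing (degrees) -> (u, v) in m/s.
    expected: one of the filter tables above.
    """
    for bearing, (u_sign, v_sign) in expected.items():
        u, v = samples[bearing]
        if u_sign is not None and u * u_sign <= 0:
            return False
        if v_sign is not None and v * v_sign <= 0:
            return False
    return True


def ideal_vortex(bearing, speed=10.0):
    """Wind of an idealised clockwise (SH cyclonic) vortex at a given bearing."""
    theta = math.radians(bearing)
    return (speed * math.cos(theta), -speed * math.sin(theta))


bearings = range(0, 360, 45)
closed_low = {b: ideal_vortex(b) for b in bearings}
# Hypothetical open trough: same meridional wind, but westerlies everywhere.
open_trough = {b: (20.0, ideal_vortex(b)[1]) for b in bearings}

closed_results = [passes_filter(closed_low, f)
                  for f in (SINGLE_POINT, FOUR_POINT, EIGHT_POINT)]
trough_results = [passes_filter(open_trough, f)
                  for f in (SINGLE_POINT, FOUR_POINT, EIGHT_POINT)]
```

The idealised closed vortex satisfies all three filters, whereas the open trough fails each of them because the zonal wind south of the centre remains westerly.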
As well as the wind identification criteria, the impact of using multiple criteria schemes is investigated here. This is done by adding further variables one-by-one for temperature and PV in order to identify the cold-core and stratospheric intrusion conditions, respectively. These are the most often used criteria for identifying COLs, and the sensitivity analysis allows us to verify how temperature and PV, as well as their combination, affect the COL identification. The full set of criteria is exactly the same as used in Pinheiro et al. (2017), where the COL tracks are referenced to zonal anomalies of the 300-hPa PV (PV300) and the 300-hPa temperature (T300) by searching for the minimum value over a spherical cap region of 5° centered on the ξ300 minima. Tracks with T300 values lower than −3 K and PV300 values lower than −2 PVU (1 PVU = 10⁻⁶ m² s⁻¹ K kg⁻¹) for at least four consecutive steps (1 day) are retained. Additionally, an easterly wind (defined as zonal wind values lower than −8 m/s) must occur at 5° south of the ξ300 minima. A potential COL is defined using a multiple step scheme when the steps described above are satisfied. These criteria were applied independently of the wind filters described above in order to verify the sensitivity of the COL identification to different choices of the three parameters (U300, PV300 and T300). It is possible to examine the effect of PV300 and T300 together with the wind criteria, but this would not change the conclusions.
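A minimal sketch of the multiple step scheme is given below, using the thresholds quoted above. It assumes that the three conditions must hold simultaneously at each of the four consecutive steps, which the text does not state explicitly; the field names `u_south`, `t_anom` and `pv_anom` and the function names are hypothetical.

```python
def longest_run(flags):
    """Length of the longest run of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


def is_potential_col(track, min_steps=4, u_max=-8.0, t_max=-3.0, pv_max=-2.0):
    """Apply the wind, cold-core and PV thresholds to a six-hourly track.

    track: list of dicts with keys
        u_south  zonal wind 5 deg south of the centre (m/s)
        t_anom   minimum zonal anomaly of T300 within 5 deg (K)
        pv_anom  minimum zonal anomaly of PV300 within 5 deg (PVU)
    Assumption: all three conditions are required at the same time steps.
    """
    flags = [
        p["u_south"] < u_max and p["t_anom"] < t_max and p["pv_anom"] < pv_max
        for p in track
    ]
    return longest_run(flags) >= min_steps


good_step = {"u_south": -12.0, "t_anom": -4.5, "pv_anom": -3.0}
warm_step = {"u_south": -12.0, "t_anom": -1.0, "pv_anom": -3.0}

persistent_track = [good_step] * 5
interrupted_track = [good_step, good_step, good_step, warm_step, good_step]
```

Here `persistent_track` is retained, while `interrupted_track` is rejected because the cold-core condition fails at the fourth step and the longest qualifying run is only three steps.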
Additionally, COL tracks obtained using the different methods are individually compared with each other through a track matching algorithm (Hodges et al. 2011), which is used to match the same COLs: two tracks are matched when their mean separation distance is less than 4° (geodesic) and they overlap in time by at least 50% of the track points. Spatial statistics for the identified COLs are computed using spherical kernel estimators (Hodges 1996).
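The matching rule can be sketched as below. This is an illustrative reading of the criteria rather than the algorithm of Hodges et al. (2011): it assumes each track is a mapping from time step to (latitude, longitude), that the mean separation is computed over the common time steps only, and that the 50% overlap must hold for both tracks. The function names and sample tracks are hypothetical.

```python
import math


def geodesic_deg(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in degrees (haversine)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    a = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def tracks_match(track_a, track_b, max_sep=4.0, min_overlap=0.5):
    """True if two tracks overlap enough in time and are close enough in space."""
    common = sorted(set(track_a) & set(track_b))
    if not common:
        return False
    # Assumption: the overlap fraction is required with respect to both tracks.
    if len(common) < min_overlap * len(track_a):
        return False
    if len(common) < min_overlap * len(track_b):
        return False
    mean_sep = sum(
        geodesic_deg(*track_a[t], *track_b[t]) for t in common
    ) / len(common)
    return mean_sep < max_sep


# Keys are hours since the start of the period; values are (lat, lon).
vort_track = {0: (-30.0, 100.0), 6: (-31.0, 102.0),
              12: (-32.0, 104.0), 18: (-33.0, 106.0)}
geop_track = {6: (-31.5, 102.5), 12: (-32.5, 104.5),
              18: (-33.5, 106.5), 24: (-34.0, 108.0)}
far_track = {6: (-50.0, 102.0), 12: (-51.0, 104.0), 18: (-52.0, 106.0)}
```

With these values `vort_track` and `geop_track` share three of four time steps and lie well within 4° of each other, so they match, whereas `far_track` overlaps in time but is roughly 19° away.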
Since there is wide variation in the definition of COLs, a standard definition of COLs needs to be established in order to facilitate the method assessments. This study uses the classical definition of Palmén (1949) for COLs referring to those cyclonic circulation systems in the upper troposphere that are "cut-off" from the westerlies (i.e. a definition that is independent of the physical and dynamical characteristics). To avoid the numerous higher latitude vortices, only the observed COLs located north of 50°S are considered.

Identification using winds
The three different filters used to separate COLs from upper level troughs (see Fig. 1) are first analysed and discussed here. The relative vorticity is the preferred field for the tracking because more systems are identified using ξ300 than Z′300. For the 36-year period (1979-2014), 540 tracks per year were identified using the single-point filter, 519 tracks with the four-point filter, and 300 tracks with the eight-point filter. Note that the numbers of COLs detected using the single-point and the four-point filters are similar, but there is a significant reduction in the number of identified COLs for the eight-point filter. This reduction is due to the greater number of constraints, i.e., the stricter the criteria, the smaller the sample size.

A particular issue concerning the use of the stricter criteria (e.g. eight-point filter) is that the COL circulation varies through the lifecycle, resulting in problems when using fixed criteria. For example, we often find that many of the observed COLs present an elongated trough with significant westward tilt during the early stages of their lifecycle. This problem is illustrated in Fig. 2 for an individual COL in South America at two different stages. During the formation (Fig. 2a), the vorticity minimum identified by the method (represented by the symbol "x" in white color) is far offset from the COL centre due to the shear component associated with the upper-level flow, as discussed in Bell and Keyser (1993). On the other hand, objective methods work much better for COLs with a more symmetric circulation, which is more often found in the mature stage, as shown in Fig. 2b. The problem described above is minimised by using simpler criteria such as the single-point or four-point filters. This is a typical example of a problem that could be more accurately diagnosed using machine learning methods based on pattern recognition techniques. For example, such methods could offer the possibility of applying a filter that rotates in such a way that it matches the tilt, starting from an initial set of rules and then updating as more systems are detected.
The comparison between the three filtering methods is also performed by matching the same tracks as done in Hodges et al. (2011). The tracks are compared using a prescribed threshold for the mean separation distance, chosen here to be 4° (geodesic), together with the requirement that the tracks overlap in time by at least 50% of the track points. The largest absolute number of matches is observed between the single-point and the four-point filters (398 tracks per year), which corresponds to approximately 74% and 77% of the total tracks found in the single-point and the four-point filters, respectively. Comparing the single-point and the eight-point filters, there is a much lower number of matches (280 tracks per year), equivalent to 52% and 93% of the tracks found in the single-point and the eight-point filters, respectively. Finally, the comparison between the four-point and the eight-point filters has 244 matches yearly, which corresponds to 47% and 81% of the tracks observed in the four-point and the eight-point filters, respectively. Surprisingly, the number of matches does not increase for the most intense COLs for each filter, as is the case for the comparison of COLs between different reanalyses (Pinheiro et al. 2019). This means the choice of the filtering method to detect cut-off lows does not affect the distribution of the COL intensities, i.e. the matches occur similarly for both strong and weak systems.
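The match percentages quoted above follow directly from the yearly track counts reported for each filter (540, 519 and 300 tracks per year). The short check below reproduces that arithmetic; the dictionary names and the helper `match_percentages` are introduced here for illustration only.

```python
# Tracks per year for each wind filter, as reported in the text.
tracks_per_year = {"single": 540, "four": 519, "eight": 300}

# Matched tracks per year for each pair of filters, as reported in the text.
matches_per_year = {
    ("single", "four"): 398,
    ("single", "eight"): 280,
    ("four", "eight"): 244,
}


def match_percentages(matches, totals):
    """Express each match count as a rounded percentage of both filters' totals."""
    result = {}
    for (first, second), count in matches.items():
        result[(first, second)] = (
            round(100 * count / totals[first]),
            round(100 * count / totals[second]),
        )
    return result


percentages = match_percentages(matches_per_year, tracks_per_year)
```

The high percentages relative to the eight-point totals (93% and 81%) indicate that most tracks retained by the strictest filter are also found by the two simpler filters.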
The comparison between the three filtering methods is also performed by analysing the maximum intensity distributions of the 300-hPa COLs, determined by finding the ξ300 maximum (scaled by −1) along the detected tracks for each filter, and shown in Fig. 3. The ξ300 maxima are determined by searching for the maximum value within a 5° geodesic radius centered on the 300-hPa COL centre. It is apparent that there is a similar intensity distribution for each of the filters, which have the maximum frequency and number for values around 10 × 10⁻⁵ s⁻¹. The extreme tails shown in the top inset graph in terms of frequency (Fig. 3a) show that the tail is marginally longer for the eight-point filter than for the other filters. However, the difference in number in respect of the extreme tails (Fig. 3b) is not significant among the filters. The conclusion is that although there are substantial differences in number, particularly with respect to the eight-point filter, the intensity distribution is similar for each filter. Therefore, the choice of the method to detect the cut-off circulation does not affect the type of detected system. Although the results of this study provide quantitative analyses, we also attempt to identify which of the methods comes closest to "the ground truth", given the multiple choices of method arising from different variable combinations. An indication of the best method for our purpose was obtained from a visual inspection of the Z300 maps (not shown), which shows that the highest number of COLs is observed for the single-point filter, followed in order by the four-point and eight-point filters. This means that the single-point filter detects more observed COLs than the other filters, but it also detects more unwanted systems, such as upper tropospheric troughs. We found that about 20% of the tracks identified with the single-point filter do not have an associated closed circulation, as observed visually in the geopotential maps.
The method using the four-point filter, on the other hand, reduces the detection of unwanted tracks compared to the single-point filter, and it allows the detection of observed COLs in numbers as large as those obtained with the single-point filter. The eight-point filter retains fewer troughs than the other methods, but excludes many observed COLs. This happens, for example, when the axis of the COL tilts westward in the horizontal maps, as demonstrated in Fig. 2.
A particular observation is that the method detects longer tracks using relative vorticity than those observed in previous studies using geopotential, as the vorticity allows systems to be identified much earlier. These differences are minimised when using the cut-off stage instead of the full life cycle. The cut-off stage is the time interval between the first and last time steps that satisfy the wind filter, for instance, the four-point filter. The full life cycle is the whole track over which the feature points satisfy the ξ300 or Z′300 thresholds. By imposing the cut-off stage criterion, the distributions of COL lifetimes for ξ300 and Z′300 are more similar than those observed in the full lifetime distributions. Figure 4 shows that most COLs last one or two days during the cut-off stage, although a few tracks persist for more than 10 days. This result is observed for both ξ300 and Z′300 tracks. The mean lifetime of the cut-off stage is exactly the same for the ξ300 and Z′300 COLs (2.9 days), similar to the mean lifetime found in previous studies using geopotential for the COL tracking (Fuenzalida et al. 2005; Campetella and Possia 2007; Reboita et al. 2010).
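The distinction between the cut-off stage and the full life cycle can be made concrete with a small sketch. It assumes six-hourly steps and that the duration is measured from the first to the last step satisfying the wind filter, without adding an extra step at the end; whether the original counting is inclusive is not stated in the text. The function name `cutoff_stage_days` and the sample flags are hypothetical.

```python
def cutoff_stage_days(filter_flags, step_hours=6):
    """Duration (days) between the first and last steps passing the wind filter.

    filter_flags: one boolean per track point, True where the filter is satisfied.
    Returns 0.0 if the filter is never satisfied.
    """
    passing = [i for i, ok in enumerate(filter_flags) if ok]
    if not passing:
        return 0.0
    return (passing[-1] - passing[0]) * step_hours / 24.0


# Hypothetical track of seven six-hourly points (1.5 days of full life cycle).
flags = [False, True, True, True, True, True, False]
full_lifecycle_days = (len(flags) - 1) * 6 / 24.0
cutoff_days = cutoff_stage_days(flags)
```

For this track the full life cycle spans 1.5 days but the cut-off stage lasts 1.0 day, illustrating how the early and late open-trough points are excluded from the lifetime statistics.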

Sensitivity of identifying Cut-off Lows using cold-core and stratospheric intrusion criteria
The sensitivity of the method to identify COLs is also assessed with respect to different identification schemes in terms of numbers, seasonality and intensity. The schemes use ξ300 for the tracking and the following additional fields: (1) U300; (2) U300 and PV300; (3) U300 and T300; and (4) U300, PV300 and T300. As expected, the largest number of COLs is found for the simplest method (i.e. using only U300) with 540 tracks per year, whereas the smallest number of COLs is found for the multiple criteria method (using U300, PV300, and T300) with 285 tracks per year, i.e. the full set of criteria yields about half the number of detected COLs compared to the simplest scheme. The schemes that use U300/PV300 and U300/T300 found 446 tracks and 327 tracks per year, respectively. Figure 5a shows the monthly distribution and Fig. 5b the intensity distribution (computed using the full resolution) of the detected COLs for the four different schemes. The single variable scheme, i.e. using only U300, is used as a reference as this scheme detects the largest number of COLs observed in the Z300 maps. For the single variable scheme compared with the full set of criteria (using U300, PV300 and T300), the seasonality shows relatively small differences as both schemes show the maximum in March and the minimum in August or September. The frequency of the detected COLs differs between the two schemes, but the difference in number remains roughly constant through the year. However, the intensity distribution shows that the full set of criteria has more of the strongest systems than the method using a single parameter.
[Figure caption fragment: Analysis performed using the ERAI reanalysis for a 36-year period (1979-2014). The insets represent the high-intensity tails. Unit is s⁻¹.]
For the comparison of the method using T300 together with U300, again using the single variable scheme as a reference, the largest differences are found in winter when typically the strongest COLs occur. Despite this difference, the seasonality and intensity distributions are similar between the two methods, suggesting that the type of the detected COLs does not change much when the cold-core criterion is used. Therefore, the use of the cold-core criterion seems to affect mainly the winter systems, when the decrease in numbers occurs.
[Figure caption fragment: The four schemes for the ξ300 Cut-off Lows are: (1) U300 (black line); (2) U300 and PV300 (red line); (3) U300 and T300 (green line); (4) U300, PV300, and T300 (Full, blue line). The thresholds used for each variable are: U300 < −8.0 m/s; zonal anomaly of T300 < −3.0 °C; zonal anomaly of PV300 < −2.0 PVU. The zonal anomalies of PV300 and T300 are determined by searching for the minimum value within a 5° geodesic radius centered on the centre of the Cut-off Lows. Analysis performed using the ERAI reanalysis for a 36-year period (1979-2014).]
The largest difference in seasonality occurs when comparing the method using PV300 together with U300, as the maximum frequency is found in May, with a secondary maximum in October, and the minimum in January, indicating a somewhat similar distribution to that obtained for 500-hPa COLs (Fuenzalida et al. 2005; Ndarana and Waugh 2010). The monthly distributions show that the largest differences in number occur in the summer months, whereas they are relatively small for the period between May and October. The maximum frequency in winter may be a result of the higher mean intensity of COLs (see Fig. 7), leading to a more pronounced tropopause folding as well as more efficient transport of cyclonic PV values from the stratosphere into the upper troposphere. In contrast, COLs are relatively weak in summer so that the PV intrusion is less effective compared to winter, dramatically reducing the number of COLs detected using the U300/PV300 method. Furthermore, the seasonal variation of the COL number appears to be related to the seasonal changes of the dynamical tropopause (Kunz et al. 2011), as the tropopause height reaches its lowest (highest) position in winter (summer), directly affecting the COL detection using schemes with PV, as previously demonstrated by Wernli and Sprenger (2007) and Ndarana and Waugh (2010). An alternative approach to reduce the seasonal variation of PV might be the use of temporal anomalies, but this would require an additional calculation.
Except for the scheme using PV300 together with U300, the other identification schemes have a similar seasonality of COLs, as the peak and trough in occurrence are found at a similar time of year, which is in agreement with previous studies (Ndarana and Waugh 2010; Favre et al. 2012), providing some confidence in how these systems are represented. However, even though the number of COLs identified does change, the number of COLs that occur in reality is unknown. Comparing these results with previous studies of COLs in the SH, Reboita et al. (2010) found substantial differences in frequency when using relatively old reanalyses (NCEP-NCAR and ERA-40), finding that the annual average of the 300-hPa COLs is 197 using NCEP-NCAR and 349 using ERA-40. These numbers are much greater than the frequency shown by Ndarana and Waugh (2010) for a similar region for COLs at 250 hPa (120 systems per year) using the NCEP-NCAR reanalysis and a method to identify COLs similar to that used in Reboita et al. (2010), but with a more explicit imposition of the cold core. The single variable scheme used in the present work identified an annual average of 547 COLs using ξ300 and 355 COLs using Z′300 in ERAI. However, the frequency reduces to 285 COLs if using the full set of criteria for ξ300, i.e. with cold-core and PV restrictions. According to Pinheiro et al. (2019), differences in number are also apparent between the modern reanalyses ERAI, NCEP Coupled Forecast System Reanalysis (NCEP CFSR), the second National Aeronautics and Space Administration Modern Era Retrospective Reanalysis for Research and Applications (NASA MERRA-2), and the Japanese 55-year Reanalysis (JRA-55), although there is a considerable improvement in the agreement between the new reanalyses compared to the older Japanese 25-year Reanalysis (JRA-25). These results confirm that the number of systems is very sensitive to the criteria and dataset used to identify COLs.
It is also worth pointing out that it is difficult to fit the COLs into neat classes because the COLs by nature are more complicated than the typical aspects described in conceptual models.

Spatial statistics for relative vorticity and geopotential COLs
In this section, we present results for the comparison of the spatial statistics of COLs based on the relative vorticity and geopotential tracking and the four-point filter scheme (without PV and temperature restrictions), because this minimises issues associated with the increase (decrease) in number of identified troughs (COLs) compared to the other wind filters. The annual track densities for ξ300 and Z′300 COLs in the SH are shown in Fig. 6. This shows that the Z′300 distribution is very similar to that obtained from ξ300 based on the same tracking methodology and selection criteria. The main areas of COLs are located near the continents for both ξ300 and Z′300, but the density values differ substantially between the two fields. In general, the ξ300 densities are greater than the Z′300 densities because more COLs are identified with ξ300 owing to the differences in scales, as discussed in Hoskins and Hodges (2002). In addition to this aspect, there are other obvious differences between the ξ300 and Z′300 spatial distributions. One is the ξ300 track density maximum over the central Indian Ocean, which is related to weak summer COLs (figure not shown). Another difference is that the ξ300 COL activity occurs at more northern latitudes in the SH than that observed for Z′300, as seen in northern Australia and South America. There are at least two reasons explaining the latitudinal differences. One is the shear component effect that shifts the cyclonic vorticity maximum equatorward, possibly resulting in the differences of the track densities. The other may be the presence of stronger geopotential gradients and Laplacians as the COLs move poleward as a consequence of geostrophy.

It is worth mentioning that methods using fields that focus on the smaller synoptic scales (e.g. vorticity) tend to identify longer tracks than methods using larger-scale fields (e.g. geopotential). The choice of the tracking field affects the track density estimation since this statistic is sensitive to the track length: longer tracks can result in higher track densities, and vice versa. This may be the reason for the track density gaps over the ocean areas in Z′300 compared to ξ300. A similar distribution of the Z′300 COLs has been shown in earlier studies based on the geopotential (Fuenzalida et al. 2005; Reboita et al. 2010).
Results for the number of ξ300 and Z′300 COLs and the number of matches between the two sets for each season are shown in Table 1. The annual average is 519.1 COLs for ξ300 and 407.9 COLs for Z′300. The difference in number of COLs between ξ300 and Z′300 increases when no additional fields are used as criteria (figure not shown). This means that the filter used to separate COLs from upper-level troughs (for instance, the four-point filter) makes the identification stricter and reduces the spatial differences found between the relative vorticity and geopotential fields. Despite the differences in numbers, the seasonality is quite similar between the two fields, although it is somewhat more pronounced in ξ300. Two tracks are considered a match when their mean separation distance is less than 4° and they overlap in time for more than 50% of their points. The largest number of matches occurs in autumn with 90.4 tracks representing 81% (62%) of the Z′300 (ξ300) tracks, while the lowest number of matches is observed in winter with 61.9 tracks representing 73% (63%) of the Z′300 (ξ300) tracks. The reason for the relatively low number of matches in winter will be discussed later.
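The matching criterion described above can be illustrated with a minimal sketch. It is not the tracking software used in this study: the track representation (a dictionary from time step to latitude and longitude), the function names, and in particular the definition of the temporal overlap as 2 n_common / (n_a + n_b) are assumptions made for illustration only.

```python
import math

def angular_distance_deg(lat1, lon1, lat2, lon2):
    """Great-circle separation in degrees (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    a = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))

def tracks_match(track_a, track_b, max_sep_deg=4.0, min_overlap=0.5):
    """Return True if two tracks match.

    Each track is a dict {time_step: (lat, lon)}. The overlap measure
    2 * n_common / (n_a + n_b) is an assumed definition, not taken
    from the text.
    """
    common = sorted(set(track_a) & set(track_b))
    if not common:
        return False
    overlap = 2 * len(common) / (len(track_a) + len(track_b))
    mean_sep = sum(
        angular_distance_deg(*track_a[t], *track_b[t]) for t in common
    ) / len(common)
    return overlap > min_overlap and mean_sep < max_sep_deg

# Hypothetical 6-hourly tracks (hours since start -> position)
vor = {0: (-30.0, 150.0), 6: (-31.0, 152.0), 12: (-32.0, 154.0), 18: (-33.0, 156.0)}
geo = {6: (-31.5, 152.5), 12: (-32.5, 154.5), 18: (-33.5, 156.5), 24: (-34.0, 158.0)}
far = {6: (-31.0, 100.0), 12: (-32.0, 102.0), 18: (-33.0, 104.0)}
late = {18: (-33.0, 156.0), 24: (-34.0, 158.0), 30: (-35.0, 160.0), 36: (-36.0, 162.0)}
```

In this example `vor` and `geo` share three of their four points and stay well within 4° of each other, so they match; `far` fails the distance test and `late` fails the overlap test.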
The seasonal intensity distribution of the ξ300 and Z′300 COLs, computed from the full-resolution ξ300 and Z′300 fields, is shown in Fig. 7. Values are determined from the maximum area average within a 5° geodesic radius (scaled by −1), centered on the ξ300 and Z′300 maxima. The probability density distribution for ξ300 COLs, referenced to the ξ300 tracks, provides a similar perspective to that shown by Pinheiro et al. (2017, their Fig. 6), who used a multiple-criteria scheme. The most intense ξ300 COLs are found in winter, followed by spring, autumn and summer. Similarly, the Z′300 distribution, referenced to the Z′300 tracks, also shows a broad range of values and significant differences between the seasons, with the strongest COLs occurring in winter. This last result contradicts what was observed by Fuenzalida et al. (2005), since these authors found no significant seasonal and geographical variations of intensity for 500-hPa COLs based on the Laplacian of 500-hPa geopotential. It is difficult to say which field is best at representing the intensities of COLs because vorticity and geopotential are very different.
Fig. 6 Annual track density of Southern Hemisphere 300-hPa cut-off lows for (a) vorticity and (b) geopotential. Track density is shown by shading and solid lines with a contour interval of 2.0 units. Analysis is performed using the ERAI reanalysis for a 36-year period. Unit is number per season per unit area, where the unit area is equivalent to a 5° spherical cap (≅ 10⁶ km²)
To verify where the differences in the spatial distribution of the ξ300 and Z′300 COLs occur, the track matching algorithm is used to construct statistics based on the tracks that match and do not match. Figure 8 shows the track density based on the tracks that match between ξ300 and Z′300 (white solid lines) and the tracks that do not match, i.e., the difference between ξ300 and Z′300 (shaded).
The results indicate that, in general, the regions of high density of matches coincide with the regions of high density of differences. This result is expected since both the matches and the non-matches are concentrated in the regions of highest track density (see Fig. 6). This occurs for regions of high COL activity located around the main continental areas, in particular in summer and autumn when the densities of matches and differences reach 12 and 10 per season (per unit area), respectively. High values for the matches and differences are also found over the oceans during the summer, such as over the Indian Ocean, mostly due to small-scale weak systems. Positive values for the density of differences mean that in general there are more ξ300 tracks than Z′300 tracks. Surprisingly, the opposite occurs over southeastern Australia and parts of the Pacific in winter, where the track density is larger in Z′300 than in ξ300. The reason for this result is not clear, but it may be explained by the fact that the Z′300 tracks are longer than the ξ300 tracks in this region (figure not shown), and the longer tracks result in larger values of track density as discussed above. According to Pinheiro et al. (2017), the western Pacific is a preferred region for both COL genesis and lysis, and this aspect may result in uncertainties due to the difficulty of identifying the COL lifecycle. Similar problems are also found over the central Indian Ocean, where the track density values based on the matches are comparable to those based on the non-matches. This occurs particularly during the winter, consistent with the relatively small number of matches between ξ300 and Z′300 in this period, as shown in Table 1. To provide a more detailed comparison between relative vorticity and geopotential, a three-month period is used to illustrate how the COL tracks differ between the two fields.
Figure 9 shows the ξ300 and Z′300 tracks plotted for the period of June, July and August 2010. The symbol 'x' indicates the position of the genesis of each COL, and the colored lines indicate the intensity at each 6-hourly time step in units of 10⁻⁵ s⁻¹ for relative vorticity, and in gpm for geopotential, both scaled by −1. The most notable differences between the relative vorticity and geopotential are as follows: (1) the number of ξ300 tracks is greater than the number of Z′300 tracks; (2) weaker systems are less likely to be identified in the geopotential, as is the case for some tracks seen in southeastern Brazil and north of Australia; (3) the ξ300 tracks are normally longer than the Z′300 tracks since the vorticity allows systems to be identified earlier in their life cycle.
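As a quick check on the unit area used for the track densities in this section, the area of a spherical cap can be evaluated directly. The calculation below assumes that 5° is the angular (geodesic) radius of the cap and that the Earth is a sphere of mean radius R ≈ 6371 km; neither value is stated explicitly in the text.

```latex
\[
A_{\mathrm{cap}} = 2\pi R^{2}\,(1-\cos\theta), \qquad \theta = 5^{\circ}, \quad R \approx 6371\ \mathrm{km}
\]
\[
A_{\mathrm{cap}} \approx 2\pi \times (6371\ \mathrm{km})^{2} \times 3.805\times10^{-3}
\approx 9.7\times10^{5}\ \mathrm{km^{2}} \approx 10^{6}\ \mathrm{km^{2}}
\]
```

Under these assumptions the cap area is within a few percent of the 10⁶ km² quoted for the unit area.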

Discussion and conclusions
In the literature, there are a number of studies of COLs that examine different aspects and use identification methodologies whose criteria differ widely, introducing uncertainties between the studies and their conclusions. Choosing a method to identify COL features is difficult, but it needs to be considered in order to yield consistent results and to help in COL forecasting, as COLs can be complicated to follow in time. In this study, we have presented different methodologies in order to understand how different criteria affect the numbers, seasonality and intensity distribution of austral COLs. The study reported herein does not compare our results to those of other methodologies since Pinheiro et al. (2017) have already compared previous studies. However, our results partially agree with previous findings (e.g. Ndarana and Waugh 2010; Reboita et al. 2010) in that the number and seasonality of COLs depend on the level used to identify the systems.
Fig. 8 (caption, partial) The density of matches and differences has been suppressed where the vorticity track density is below 1.0 per season per unit area. The unit area is equivalent to a 5° spherical cap (≅ 10⁶ km²). Analysis is performed using the ERAI for a 36-year period (1979-2014)
Most studies using objective methods have restricted the tracking to relative vorticity, geopotential or geostrophic vorticity. This study has shown that by exploring a wide range of upper-level tropospheric fields, new perspectives can be obtained by identifying different features of 300-hPa COLs. The most obvious differences between using relative vorticity and geopotential to identify COLs are the larger number (and greater length) of ξ300 tracks compared to Z′300 tracks, and the fact that the weaker ξ300 COLs are likely not to be identified using Z′300. This is particularly the case for COLs at lower latitudes, where geopotential gradients are weaker than at higher latitudes. A similar argument can be applied to summer COLs, which are typically weaker than winter COLs.
The comparison of COL identification methods has shown that filtering is needed as a post-tracking step in order to separate COLs from other systems. Regarding the methods using only winds as additional requirements for the identification, the single- and four-point filters give quantitatively similar results, and the differences between them are in general smaller than the differences between either of them and the eight-point filter. Despite the potentially large differences in the number of COLs, the derived intensity distribution is similar for each applied filter. Therefore, the choice of the method to detect the cut-off circulation does not affect the type of detected system. The four-point filter in particular is considered the best compromise for the COL identification among the three types of filters (at least for the study region) because it minimises issues associated with the increase (decrease) in number of identified troughs (COLs). However, this choice involves a certain degree of subjectivity that reflects the complicated nature of COLs, due to regional differences and the fact that the majority of these systems have short lifetimes, which makes it difficult to define a reference system.
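The idea behind a four-point wind filter can be sketched as follows. This is only an illustration of how sampled winds can separate a closed cyclonic circulation from an open trough in the Southern Hemisphere. The names `WindSamples` and `is_closed_cyclone_sh`, the use of zero thresholds and the choice of sampling points are all assumptions, not the exact filter configuration used in this study.

```python
from typing import NamedTuple

class WindSamples(NamedTuple):
    """Winds (m/s) sampled at four points offset from the cyclone centre."""
    u_equatorward: float  # zonal wind at the point north of the centre
    u_poleward: float     # zonal wind at the point south of the centre
    v_east: float         # meridional wind at the point east of the centre
    v_west: float         # meridional wind at the point west of the centre

def is_closed_cyclone_sh(w: WindSamples) -> bool:
    """Check for a closed clockwise (SH cyclonic) circulation.

    Assumed conditions: westerlies equatorward, easterlies poleward,
    southward flow to the east and northward flow to the west.
    """
    return (
        w.u_equatorward > 0.0
        and w.u_poleward < 0.0
        and w.v_east < 0.0
        and w.v_west > 0.0
    )

# Hypothetical samples: a cut-off low and an open trough
cutoff = WindSamples(u_equatorward=12.0, u_poleward=-4.0, v_east=-6.0, v_west=7.0)
trough = WindSamples(u_equatorward=12.0, u_poleward=5.0, v_east=-6.0, v_west=7.0)
```

The open trough fails only because the flow on its poleward side is still westerly, which is the feature that distinguishes it from a cut-off circulation in this sketch.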
The analysis of sensitivity in relation to specific variables used to identify COLs allows us to understand how the choice of method and thresholds affects the results, and how these choices introduce uncertainties between studies through differences in seasonality and numbers. The four schemes (one scheme using only winds and three schemes using PV and/or temperature) used to identify COLs pick up distinct types of systems simply because different criteria impose different constraints. The single-parameter scheme, which is based only on winds, was found to be less selective than the multiple-step schemes, resulting in a larger number of both detected and observed COLs than the other methods.
The sensitivity analysis with respect to PV indicates that the PV criterion is a good choice for detecting the strongest systems, but it may be too restrictive in some cases, particularly for weak summer COLs. This result is likely related to how the spatial structure of PV varies with season. Since the PV anomalies associated with COLs can result from the RWB process (Ndarana and Waugh 2010), which in turn is connected with the split jet flow (Berrisford et al. 2007), it is plausible to assume that the seasonal variability of the jet stream affects the seasonality of COLs. In addition, the COL variations with season are not uniform over the SH due to the regional differences in the jet stream variability, since the shift of the jets throughout the year is greater in Australia and the southwestern Pacific than in other sectors (Archer and Caldeira 2008). This means that the SH COLs exhibit a regionally dependent seasonality, as observed in previous studies (Fuenzalida et al. 2005; Pinheiro et al. 2017).
The identification of a cold core as a condition to detect COLs is another aspect that leads to uncertainties between studies, since the cold-core search is generally performed at different layers, which seem to be chosen arbitrarily. In this study, the cold-core criterion is imposed at a fixed level (300 hPa), which makes the COL identification effective for most situations. However, during the period of low activity of COLs (corresponding to winter and early spring) the tropopause in the low mid-latitudes is typically low when compared to other seasons (Appenzeller et al. 1996), leading to incursions of the stratospheric warm pool into the upper-level COLs and, as a consequence, reducing the number of detected winter COLs compared to methods without a cold-core criterion. This is probably one of the main reasons why many studies found differences in the seasonality of COLs observed in similar regions.
Although it is not possible to say categorically which of these methods is the best choice for COL identification, due to the lack of a climatology that could be used as the reference, simpler schemes based only on winds (in our case the four-point filter) should be more representative of reality since they simply impose on the detection system a cyclonic circulation appearance regardless of the physical and dynamical characteristics. Therefore, this type of method may be considered as a standard method for identifying COLs that can be used for either operational or research purposes, providing some improvement particularly if we wish to identify large samples of COLs.
It is worth mentioning that the additional fields used in the method do not discriminate between COLs that are essentially formed at upper levels and those associated with frontal occlusions, because it is difficult to separate these using objective methods. In fact, there are no criteria to clearly distinguish the upper-level COLs from those originating with a frontal system, since both types of COLs often have similar characteristics, for example, a cold core in the middle to upper troposphere (Pinheiro 2010). In addition, a characteristic of the conceptual model of Bell and Bosart (1989) is that the COLs are located equatorward of the main westerlies, but this is difficult to implement in algorithms because in practice there is an arbitrary choice in defining the jet stream position. Also, it is reasonable to suppose that the choice of method for COL identification may affect the precipitation attributed to COLs. These possibilities could be examined in further studies.
The differences from previous studies raise the question of whether the different conceptual models are due to regional differences in the COL structure. Thus, further study is needed to investigate the COL structure using a common set of analysis methods applied to particular regions, to see how the methods might work better if regional and seasonal features could be detected. Finally, it is worth considering that, even if similar schemes were employed, the results could vary significantly depending on other aspects such as the level used to identify COLs, region, intrinsic variability, and uncertainties in datasets, such as reanalysis data (Pinheiro et al. 2019).
[...] more appropriate for the observed COL motion, makes the tracks closer to the typical observed tracks by excluding possibly more "merged systems" as well as more mobile earlier or later stages of the COL lifecycle.
The track smoothness constraint is measured in terms of changes in direction and speed. This is achieved by specifying values for the upper-bound track smoothness constraint (ψmax), which is a function of the mean displacement distances over three time steps. The smoothness constraint is applied adaptively together with the displacement constraint, varying with the local mean separation distance on a track. The values used for ψmax and for the average displacement over three frames (d̄) in each method are shown in Tables 2 and 3. These values are found to be suitable for the purpose of this study. The smoothness constraint is less restrictive at smaller distances between the track points (and vice versa). For slow-moving systems, for example, larger changes in velocity (speed and/or direction) are expected to occur in a time step in comparison to faster moving systems.
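An adaptive smoothness constraint of this kind can be sketched as below. The cost function is a generic path-coherence measure combining a change-of-direction term and a change-of-speed term, evaluated on a local plane; its form and weights are assumptions and may differ from the tracking scheme actually used. The bounds in `HYPOTHETICAL_BOUNDS` are invented for illustration and are not the values of Tables 2 and 3.

```python
import math

# Hypothetical bounds, NOT the values of Tables 2 and 3:
# (upper edge of mean displacement, largest allowed smoothness cost)
HYPOTHETICAL_BOUNDS = [(1.0, 1.0), (3.0, 0.5), (math.inf, 0.1)]

def _steps(p0, p1, p2):
    """Displacement vectors between three consecutive planar positions."""
    return (p1[0] - p0[0], p1[1] - p0[1]), (p2[0] - p1[0], p2[1] - p1[1])

def smoothness_cost(p0, p1, p2, w_dir=0.2, w_speed=0.8):
    """Cost in [0, 1]: 0 for straight, constant-speed motion (assumed form)."""
    d1, d2 = _steps(p0, p1, p2)
    n1, n2 = math.hypot(*d1), math.hypot(*d2)
    if n1 + n2 == 0.0:
        return 0.0  # stationary over both steps
    if n1 > 0.0 and n2 > 0.0:
        cos_turn = (d1[0] * d2[0] + d1[1] * d2[1]) / (n1 * n2)
    else:
        cos_turn = 1.0  # direction undefined when one step is zero
    direction_term = 0.5 * (1.0 - cos_turn)
    speed_term = 1.0 - 2.0 * math.sqrt(n1 * n2) / (n1 + n2)
    return w_dir * direction_term + w_speed * speed_term

def passes_smoothness(p0, p1, p2, bounds=HYPOTHETICAL_BOUNDS):
    """Apply a bound that loosens as the mean displacement gets smaller."""
    d1, d2 = _steps(p0, p1, p2)
    mean_disp = 0.5 * (math.hypot(*d1) + math.hypot(*d2))
    cost = smoothness_cost(p0, p1, p2)
    for upper_edge, cost_max in bounds:
        if mean_disp <= upper_edge:
            return cost <= cost_max
    return False
```

With these illustrative bounds a slow-moving system may reverse direction between time steps and still be accepted, whereas the same reversal for a fast-moving system is rejected, which mirrors the behaviour described in the paragraph above.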