Dynamics and predictability of cold spells over the Eastern Mediterranean

The accurate prediction of extreme weather events is an important and challenging task, and has typically relied on numerical simulations of the atmosphere. Here, we combine insights from numerical forecasts with recent developments in dynamical systems theory, which describe atmospheric states in terms of their persistence (θ−1) and local dimension (d), and inform on how the atmosphere evolves to and from a given state of interest. These metrics are intuitively linked to the intrinsic predictability of the atmosphere: a highly persistent, low-dimensional state will be more predictable than a low-persistence, high-dimensional one. We argue that θ−1 and d, derived from reanalysis sea level pressure (SLP) and geopotential height (Z500) fields, can provide complementary predictive information for mid-latitude extreme weather events. Specifically, signatures of regional extreme weather events might be reflected in the dynamical systems metrics, even when the actual extreme is not well-simulated in numerical forecasting systems. We focus on cold spells in the Eastern Mediterranean, and particularly those associated with snow cover in Jerusalem. These rare events are systematically associated with Cyprus Lows, which are the dominant rain-bearing weather system in the region. In our analysis, we compare the ‘cold spell Cyprus Lows’ to other ‘regular’ Cyprus Low days. Significant differences are found between cold spells and ‘regular’ Cyprus Lows from a dynamical systems perspective. When considering SLP, the intrinsic predictability of cold spells is lowest hours before the onset of snow. We find that the cyclone’s location, depth and magnitude of air-sea fluxes play an important role in determining its intrinsic predictability. The dynamical systems metrics computed on Z500 display a different temporal evolution to their SLP counterparts, highlighting the different characteristics of the atmospheric flow at the different levels. We conclude that the dynamical systems approach, although sometimes challenging to interpret, can complement conventional numerical forecasts and forecast skill measures, such as model spread and absolute error. This methodology outlines an important avenue for future research, which can potentially be fruitfully applied to other regions and other types of weather extremes.


Introduction
Cold spells are a major weather-related hazard, causing premature excess mortality (Peterson et al. 2013;Ballester et al. 2016;Ryti et al. 2016), agricultural losses (e.g., Ferrarezi et al. 2019) and ecosystem damage (Boucek et al. 2016). The Eastern Mediterranean has been identified as a climate change hot-spot (Giorgi 2006; Barcikowska et al. 2020) and, although it is typically associated with warm weather and heat waves, has experienced damaging cold spells in recent decades. Moreover, the frequency of colds spells may not decrease as fast as may be naively expected under global warming, since variability may also increase (Kodra et al. 2011;Gao et al. 2015).

3
The Eastern Mediterranean climate conditions are characterized by moderate air temperatures during the winter season and dry and stable hot weather conditions during summer (Goldreich et al. 2003). The winter season is characterized by the passage of extra-tropical cyclones, named Cyprus Lows, that tend to develop or deepen over the Levant when upper level troughs reach the region (e.g., Alpert and Reisin 1986;Alpert et al. 1990a;Shay-El and Alpert 1991;Trigo 1999;Flocas et al. 2010;Ziv et al. 2015). These lowpressure systems typically persist for 1-4 days (Alpert and Ziv 1989). On rare occasions, Cyprus Lows drive severe cold spells associated with snowfall over the higher parts of the Levant, including Jerusalem. The weather forecasters in the region have long considered these cold spells as challenging to predict (Wolfson and Adar 1975;Bitan and Ben-Rubi 1978;Goldreich 2003).
A notable episode in this respect was the severe cold spell named 'Alexa' (11-14 December 2013), which lead to widespread snowfall and mayhem in the region. The damage was estimated at ~ 100 million US dollars, making it the region's costliest natural disaster ever recorded (https ://ims.gov.il/he/ Clima teRep orts). Part of the disruption can be reasonably ascribed to the poor short-term forecasts of the event, which severely underestimated the associated precipitation, wind intensity and snow depth (Hochman et al. 2019). The ability to predict this type of events, and more generally weather events lying in the tails of the respective distributions, is therefore of crucial importance.
The practical ability to predict specific atmospheric configurations is strongly dependent on the details of the numerical model being used, which presumably reflects our broader understanding of the dynamic and thermodynamic processes in the atmosphere, and the accuracy of initial conditions (Lorentz 1963;Slingo and Palmer 2011). A widely-adopted approach for diagnosing the a priori practical predictability of the atmosphere is the spread of ensemble weather forecasts (e.g., Buizza 1997;Hohenegger et al. 2006;Ferranti et al. 2015). However, one may also consider the intrinsic predictability, which only depends on the properties of the atmosphere itself and of the considered atmospheric configuration. Recent developments in dynamical systems theory allow to describe instantaneous configurations in terms of the local dimension (d)-which informs on how the atmosphere evolves to and from the state of interest-and persistence in phase space (θ −1 )-which can be understood as persistence in time (Lucarini et al. 2016;Faranda et al. 2017a). These metrics are intuitively linked to the intrinsic predictability of the atmosphere: a highly persistent (low θ), low-dimensional (low d) state will be more predictable than a low-persistence (high θ), high-dimensional (high d) one (Messori et al. 2017). Different interpretations of predictability depend on different factors or may emphasize different aspects of the atmospheric circulation, and thus show discrepancies in direct comparisons . Indeed, while the notions of practical and intrinsic predictability are linked, there is not always a direct correspondence between the two (Scher and Messori 2019).
The dynamical systems approach has been successfully applied to a variety of climate fields and datasets (Faranda et al. 2017a(Faranda et al. , b, 2019aMessori et al. 2017;Buschow and Friedrichs 2018;Rodrigues et al. 2018;Brunetti et al. 2019;Hochman et al. 2019Hochman et al. , 2020De Luca et al. 2020a, b;Pons et al. 2020). Specifically, it has been shown that d and θ −1 can provide an objective dynamical characterization of synoptic systems over both the North Atlantic (Faranda et al. 2017a;Messori et al. 2017;Rodrigues et al. 2018) and the Eastern Mediterranean (Hochman et al. 2019(Hochman et al. , 2020. Hochman et al. (2019) have provided evidence that the synoptic-scale surface atmospheric pattern associated with the 'Alexa' cold spell was relatively persistent (low θ), yet with a high local dimension (high d). This is a rare combination, since there is typically a positive correlation between d and θ (e.g., Rodrigues et al. 2018).
The above studies have mostly focussed on snapshots of atmospheric configurations. Here, we systematically evaluate the potential of the dynamical systems approach to describe the temporal evolution of weather extremes and complement conventional numerical forecasts. Specifically, we hypothesize that the signature of extreme weather events may be reflected in the evolution of the dynamical systems metrics, even when numerical forecasting systems provide a poor representation of the extreme itself. In this respect, cold spells associated with snow in Jerusalem ('cold spells' hereafter) are used as a case study, and are compared to Cyprus Lows not leading to snow.
The paper is organized as follows: Sect. 2 describes the data and methods used, including a brief description of the dynamical systems analysis, the methods for evaluating the forecast spread/skill of an event and an air parcel backward trajectory tracking methodology. Section 3 presents a seasonal evaluation of the dynamical systems metrics, the dynamics of cold spells with respect to 'regular' Cyprus Lows and a detailed dynamical analysis of the 'Alexa' storm. Section 4 summarizes and concludes the study.

Data
We identify cold spells using a complete list of events with snowfall in Jerusalem, provided by the Israeli Meteorological Service for the period 1948-2015. The synoptic classification used to identify Cyprus Low days (Sect. 2.2) and the dynamical systems analysis (Sect. 2.3 and Appendix 1) are based on the National Centre for Environmental Prediction/ National Centre for Atmospheric Research (NCEP/NCAR) Reanalysis Project daily and 6-hourly data for 1948-2015 on a 2.5° × 2.5° horizontal grid (Kalnay et al. 1996). This spatial resolution is adequate for the analysis. Indeed, Faranda et al. (2017a) have shown that the general conclusions, which can be drawn from a dynamical systems analysis, are largely insensitive to the dataset's horizontal resolution, as long as the major spatial structures characterizing the field of interest are resolved. However, the air parcel tracking (Sect. 2.4) requires data on a reasonably high horizontal and vertical resolution to be accurate, and is thus applied to 6-hourly European Centre for Medium Range Weather Forecasting (ECMWF) ERA-Interim data for 1979-2015, on a regular 1° × 1° horizontal grid and 60 vertical levels (Dee et al. 2011).
The numerical forecasts are taken from the Global Ensemble Forecast System (GEFS) reforecast v.2 dataset produced by NCEP/NCAR (Hamill et al. 2013). Operational Numerical Weather Prediction (NWP) models are updated regularly, e.g., roughly every 6 months in the case of the ECMWF model. Therefore, operational forecast archives are typically inhomogeneous, and cannot easily be used for studies over long periods. So-called reforecasts, occasionally termed hindcasts, are a way to circumvent this problem. These datasets consist of a homogenous set of past forecasts using a single NWP model. The dataset used here provides a homogeneous set of daily ensemble reforecasts from December 1984 to present. Each reforecast consists of a control forecast and a ten-member ensemble on a 0.5° × 0.5° horizontal grid.
To evaluate the forecasts, we leverage a first-of-its-kind homogenised station data set over Israel. Instrumental meteorological records may be affected by a number of non-meteorological factors, such as station relocation, instrumental issues, environmental changes near the station and more. These artificial factors can hamper the accuracy of the data and thus ought to be corrected by homogeneity techniques (Aguilar et al. 2003). Our dataset consists of five representative stations in Israel with a continuous, high-quality record of minimum temperatures for 1950-2017 (Table S1, Figure  S1; Yosef et al. 2018).

The semi-objective synoptic classification
The semi-objective synoptic classification of Alpert et al. (2004a) is used to identify Cyprus Lows. This classification was found to closely reproduce the local weather conditions over the Eastern Mediterranean, especially for Cyprus Lows (Saaroni et al. 2010a, b;Dayan et al. 2012;Hochman et al. 2018a, b). The classification is based on air temperature, geopotential height and wind components U and V at 1000 hPa from the NCEP/NCAR reanalysis over the Eastern Mediterranean (27.5° N-37.5° N; 30° E-40° E; Figure S1).
Such methodology allows us to define two subsets of Cyprus Low days: cold spell Cyprus Low days or 'regular' Cyprus Low days. The reader is referred to Alpert et al. (2004a) for a description of the full classification procedure.

Dynamical systems metrics
A recently developed method combining extreme value theory with Poincaré recurrences allows computing the instantaneous properties of chaotic dynamical systems (Lucarini et al. 2016;Faranda et al. 2017a). In the present study, we consider a temporal succession of latitude-longitude maps for a given atmospheric variable, which we interpret as a long trajectory in phase space. Each map corresponds to a single point along this trajectory, for which local properties are computed. The analysis focuses on two metrics, namely the local dimension (d) and the persistence (θ −1 ). As outlined in Sect. 1, these two metrics can be linked to the intrinsic predictability of the atmosphere.
The local dimension (d) is estimated from the system's recurrences around the state of interest, for example a specific two-dimensional atmospheric map of sea-level pressure (SLP). Mathematically, the procedure stems from the result that the cumulative probability distribution of suitably defined recurrences of the system converges to the exponential member of the Generalized Pareto Distribution (Freitas et al. 2010;Lucarini et al. 2012). In practice, d measures the geometry of the trajectories in a small region of the system's phase space. It is therefore linked to the number of active degrees of freedom that a system can explore locally, or alternatively to the way the system approaches and departs from a given state.
The persistence (θ −1 ) of a state is computed by estimating the extremal index (Moloney et al. 2019), here obtained using the estimator developed by Süveges (2007). θ −1 is a measure of the persistence time of the system in the abovementioned small region of the phase space. θ −1 tends to be sensitive to small changes in the state of the system. However, Hochman et al. (2019) found that relative differences in the dynamical systems persistence may be directly linked to the more conventional notion of differences in the persistence of synoptic systems. For a brief overview of the computation of the above-mentioned metrics, the reader is referred to Appendix 1. Further details can be found in Lucarini et al. (2016) and Faranda et al. (2017aFaranda et al. ( , 2019a. Here, we specifically compute d and θ −1 for daily and 6-hourly SLP and 500 hPa geopotential height (Z500) fields from the NCEP/NCAR reanalysis over the Eastern .5° N; 30° E-40° E, same as for the synoptic classification; Figure S1). We further evaluate the dynamical systems metrics' sensitivity to the size and location of the domain they are computed on by testing three other domains: (1) a large domain [27.5° N-45° N; 15° E-50° E]; (2) a domain as large as the synoptic classification region, but centred on the average location of Cyprus Lows, i.e., east of Cyprus [30° N-40° N; 30° E-40° E]; (3) a domain extending more to the east to better capture eastward-shifted cyclonic features, such as those associated with cold spells [30° N-40° N; 30° E-45° E]. Figure S2 shows the different domains used for the sensitivity analysis. These sensitivity tests are implemented for both the daily distributions and the 6-hourly temporal evolutions of the dynamical systems metrics. To study the sensitivity of the dynamical systems metrics to the depth and location of Cyprus Lows, we follow the semi-objective synoptic classification of Alpert et al. (2004a).
To understand the differences between cold spells and 'regular' Cyprus Low days, we analyse both the CDFs and the mean temporal evolution of the two groups of days in terms of d and θ −1 . The Kolmogorov-Smirnov (for the CDFs) and Wilcoxson Rank-Sum (for the medians) tests are used for evaluating the differences between the two sets of days at the 5% significance level. A bootstrap sampling procedure is used to evaluate the 95% confidence intervals of the mean temporal evolutions. For the latter, we focus our analysis on the first day of every cold spell or Cyprus Low and at 00UTC (0 h in Figs. 6 and 9), which should roughly correspond to the time of lowest temperature.

Forecast spread/skill
In order to obtain a numerical ensemble forecast, several forecasts are performed either with slightly different initial conditions, and/or with slightly different model formulations or stochastic components in the model. Unlike deterministic forecasts, ensemble forecasts provide an effective way of characterizing uncertainty in an operational setting, by computing the ensemble spread. This is typically quantified as the standard deviation between members. Forecast spread can be interpreted as a measure of practical predictability. Conventionally, a small spread is interpreted as the forecast giving a high degree of confidence regarding the future weather (in other terms, a high practical predictability). On the other hand, a large spread is interpreted as advising more caution (a low practical predictability). This type of approach is extensively used in the study of atmospheric predictability (e.g., Buizza 1997;Hohenegger et al. 2006;Ferranti et al. 2015), although it has some limitations (see e.g. Hopson 2014, and references therein). We specifically note that the forecast spread for individual forecasts may provide a useful indication of predictability, but should not be over-interpreted.
Another oft-used forecast diagnostic is the absolute error, which provides a measure of forecast skill. In this study, we use the station dataset described in Sect. 2.1 above as ground truth to compute the forecasts' absolute error. The GEFS reforecast gridded data is bilinearly interpolated to the station locations in order to remove biases due to elevation differences between the model grid and the stations. For each station, the systematic bias calculated over the whole period is removed.
The forecasts are initialized once a day at 00 UTC and are available at three-hour lead-time intervals. Since our analysis focuses on cold spells, we estimate the spread/skill for minimum temperature and SLP at a lead-time of 69 h. Given the three-hour interval of the forecast data, and considering that the stations measure the minimum temperature at some point between 20 and 20 UTC of the next day, we argue that this time-window most closely resembles the definition of the minimum temperature for the station data. As for the dynamical systems metrics (Sect. 2.3), we centre our analysis on the first day of every cold spell or Cyprus Low and at 00UTC (0 h in Figs. 8 and 11), which should again roughly correspond to the time of lowest temperature.
As a measure for the practical predictability on a given day, we use the spread and error at a lead time of 69 h for the forecast initialised on that day. Since the dynamical systems metrics provide information on the evolution of the atmosphere in the vicinity of a given reference state, we argue that this quantity is the most indicative for relating the dynamical systems and numerical forecast analyses. In the supplementary information, we also provide additional figures with a shifted time axis, which show the spread and error for the forecast initialised 69 h prior to the marked time. In other words, the plots in the main paper show forecast initialisation times, while those in the supplementary information show the times at which the forecasts are valid. A bootstrap sampling procedure is used to evaluate the 95% confidence interval of mean forecast spread and error. The Kolmogorov-Smirnov (for the CDFs) and Wilcoxson Rank-Sum (for the medians) tests are used for comparing forecast diagnostics on different sets of days at the 5% significance level.

Air parcel tracking
Ten-day kinematic backward trajectories are computed using the Lagrangian Analysis Tool (LAGRANTO; Sprenger and Wernli 2015). The vertical and horizontal wind components needed for the trajectory computations are taken from the ERA-Interim reanalysis (Dee et al. 2011, Sect. 2.1). A schematic overview of the typical steps for trajectory computation is provided in Fig. 2 in Sprenger and Wernli (2015). The trajectories are initialized at 00UTC from each grid point in the study region on the first day of a cold spell or a ´regular´ Cyprus Low ( Figure S1). In order to analyse the near surface air masses, we consider trajectories that are initialized every 30 hPa between the surface and 850 hPa. Insights about the physical properties of the air parcels are obtained by tracking the temperature, potential temperature, and specific humidity along the trajectories. Changes in potential temperature along the trajectories can be attributed to diabatic processes such as condensational heating, radiative heating and sensible heating (e.g., Bieli et al. 2015). Increases in specific humidity indicate moisture uptake due to evaporative processes in or above the boundary layer, turbulent fluxes, or convection (e.g., Sodemann et al. 2008). One important limitation in the trajectory computation is that the horizontal and vertical wind fields do not resolve convection explicitly. Still, case studies and climatological studies reveal that the ERA-Interim reanalysis is suitable for the Lagrangian process understanding (e.g., Martius and Wernli 2012).

Seasonality of the dynamical systems metrics
Previous studies have shown that the dynamical systems metrics d and θ, have a strong seasonal dependence (Faranda et al. 2017a, b;Rodrigues et al. 2018). Thus, we remove the seasonal cycle before comparing the various events. The seasonal cycle is estimated by averaging the metrics for a given date and time (e.g., 5th January at 00UTC) over all years, repeating this operation for all timesteps within the year (i.e., from 1st January to 31st December) and finally smoothing the series with a 30-day moving average. The comparison between cold spells and 'regular' Cyprus Lows presented in the following sections are all performed using de-seasonalized values of d and θ. The seasonal cycle of d and θ is computed on SLP and Z500 over the Eastern Mediterranean. For the former, d and θ display roughly in-phase winter and summer minima (high intrinsic predictability) and maxima in the shoulder seasons, out-of-phase by 1-2 months (low intrinsic predictability; Fig. 1a). The two minima are comparable for d, while for θ the summer minimum is more pronounced than the winter one. Cyprus Lows are frequent yet not dominant during winter (~ 33% of the winter days; Hochman et al. 2018a). Thus, the wintertime average values of the dynamical systems metrics likely reflect the occurrence of other synoptic classes such as high-pressure systems, which display lower θ and d values than the Cyprus Lows (higher intrinsic predictability; Hochman et al. 2019). Indeed, the median d value for Cyprus Lows is 5.78, while for other days it is 5.39. The same is true for θ (0.81 vs. 0.75). Both differences are significant at the 5% significance level under the Kolmogorov-Smirnov (for the CDFs) and Wilcoxson Rank-Sum (for the medians) tests. On the other hand, the Persian Trough, which dominates the summer months (~ 90% of the summer days; Alpert et al. 2004b), is known to be a relatively persistent and stable atmospheric configuration (Alpert et al. 1990b) and indeed displays very low θ and reasonably low d values (Hochman et al. 2019). Hence, the Persian Trough does account for the bulk of the summertime values of the dynamical systems metrics. A similar pattern is obtained for Z500 (Fig. 1b), although the spring peak in d is shifted towards the summer months, and the d summer minimum is less marked.
The maxima of d and θ in the shoulder seasons, corresponding to high local dimensions and low persistence, reflect that these are periods when a variety of different synoptic systems affect the region (e.g., Alpert and Ziv 1989;Krichak et al. 1997). The coexistence and interaction of both winter and summer systems leads to a high-dimensional, unstable flow. In Faranda et al. (2017b), seasonal maxima in the local dimension of atmospheric flows were interpreted as saddle-like points of the atmospheric dynamics. The above supports the notion that the dynamical systems metrics are modulated by the seasons and reflect synoptic configurations in the Eastern Mediterranean.

Dynamics of cold spells over the Eastern Mediterranean
Virtually all cold spells in the Eastern Mediterranean are associated with Cyprus Lows (96% of the ones considered here). However, it is quite rare for a Cyprus Low to actually lead to a cold spell associated with snow cover in Jerusalem.
From an atmospheric dynamics' viewpoint, the cold spells are associated with a pronounced upper level trough in Z500 and a more eastern cyclone centre than other Cyprus Lows (Fig. 2). This induces a cold northerly flow, occasionally even coming from the Arctic and/or Siberia (Wolfson and Adar 1975;Goldreich 2003). The backward trajectory air parcel analysis illustrates that the flow preceding a cold spell follows a more northerly and continental pathway than the sample for all Cyprus Lows (cf. Fig. 3a and b). The main difference in the air parcels' physical properties is that the initial potential temperature, temperature and specific humidity of the cold spells' parcels are lower than the Cyprus Lows' ones by about 12 K, 10 K and 1 g kg −1 , respectively ( Fig. 3d-f). The differences in potential temperature between the two groups can mainly be attributed to the poleward origin of the air masses associated with the cold spells. From a dynamical systems point of view, the cold spells and Cyprus Lows also exhibit substantial differences. Figure 4 shows the CDFs for d and θ computed on SLP and Z500. θ is significantly lower for the cold spells relative to Cyprus Lows, i.e., the snow events are generally more persistent (Fig. 4b, d). We note however that a small number of 'regular' Cyprus Lows do display relatively high persistence (low θ computed on SLP), but do not lead to snow in Jerusalem (Fig. 4b). A more complex picture arises for the local dimension. For SLP, there is no significant difference between the two CDFs and their medians according to the Wilcoxson Rank-sum and Kolmogorov-Smirnov tests (Fig. 4a). On the contrary, Z500 shows a significantly lower d for the cold spells than for all Cyprus lows (Fig. 4c).
Next, we test the sensitivity of the dynamical systems metrics to the depth and location of the Cyprus Low using both the Kolmogorov-Smirnov and Rank-Sum tests at the 5% significance level. We find that deeper Cyprus Lows display significantly higher values of d and θ (low intrinsic predictability) computed on SLP (Fig. 5a, (Fig. 5c, d). This suggests that both the location and depth of a cyclone play an important role in determining its intrinsic predictability. Computing d and θ on Z500 shows a completely different picture. In this case, deep Cyprus Lows show significantly lower d and θ values (high intrinsic predictability) relative to shallow lows (Fig. 5e, f). However, Eastern lows still show significantly higher d and lower θ relative to Western lows (Fig. 5g,  h). This again advises that both the depth and location of the Cyprus Low play an important role in determining the intrinsic predictability also at Z500. These results also point to the different ways in which upper and lower level flows reflect on the intrinsic predictability of the atmosphere. Figure 6 displays the average temporal evolution of d and θ, again computed for both SLP and Z500. The data is centred on 00UTC of the first day of snow or Cyprus Low (0 h in the Figure). Still, precipitation associated with a Cyprus Low event may occur a few hours following the development of the cyclone. Substantial differences are found between the cold spells and 'regular' Cyprus Lows. For SLP, cold spells typically display an above-climatology d, which increases up to small negative lags and then decreases rapidly (Fig. 6a). θ shows a similar pattern, but mostly displays near-climatology persistence in the lead up to the event. The 'regular' Cyprus Lows display a less pronounced life-cycle, characterised by lower values of d and a slightly lower persistence in the days preceding the onset of the event (Fig. 6b). The build-up to the cold spells is therefore high-dimensional (pointing to a low intrinsic predictability), yet with a near-climatological or even slightly above-average persistence, which is probably a pre-requisite for intense cold air mass transport to the region. Interestingly, the peaks in d and θ occurring in the ~ 48 h prior to the event onset coincide with the Fig. 2 Sea level pressure (shading in hPa) and 500 hPa geopotential height (white contours in m) composites for cold spells associated with snowfall in Jerusalem, Cyprus Low days without the snowfall events, and their difference. a Cold spells mean composite; b Cyprus Lows without snowfall mean composite; c difference between cold spells and other Cyprus Lows time interval for which the majority of the cold-spell air parcels reach the Mediterranean Sea and rapidly increase their potential temperature, temperature and specific humidity at approximately 900 hPa height (Fig. 3). This may point to the role of upward sensible and latent heat fluxes, when cold air interacts with the warm Mediterranean Sea, in affecting the dynamical properties of the atmospheric flow. These fluxes play an important role in the generation and intensification of Cyprus Lows, and may thus conceivably influence their predictability (Stein and Alpert 1993;Alpert et al. 1995). We provide further evidence to support this hypothesis by comparing d and θ computed on SLP for the upper and lower 10% of Lagrangian changes in specific humidity. The changes in specific humidity are calculated along the trajectories of events between − 48 h and 0 h. As shown in Fig. 7, d(θ) is significantly higher (lower) for large changes in the specific humidity. Though this statistical analysis does not demonstrate causal relationship, it does provide a plausibility argument for the important role the rate of moisture uptake, likely due to air-sea fluxes, plays in determining the intrinsic predictability of Cyprus Lows and particularly cold spells.
The dynamical systems metrics computed on Z500 show a radically different picture. The temporal evolutions of d and θ for the cold spells are again in phase with each other, but now show a minimum in the hours preceding the events' onsets (Fig. 6c). The 'regular' Cyprus Lows again display a more subdued life-cycle (Fig. 6d). The apparent contradiction between the SLP and Z500 results may be partly reconciled by considering the vertical structure of Cyprus Lows. We hypothesise that the Fig. 3 Median backward trajectory for a Cyprus Lows and b cold spells with circles indicating days (from 10 days before onset to onset), trajectory density 10 days before onset (shading in number of trajectories per 1000 km 2 ), and trajectory density for the indicated time lags (5, 2, 1 days before onset, contours denote densities of a 20 trajectories per 1000 km 2 and b two trajectories per 1000 km 2 ). Streamlines of 800 hPa winds averaged between − 5 to − 1 days before onset are included. Median evolution of c height (hPa), d potential temperature (K) e temperature (K), and f specific humidity (g kg −1 ) of air parcels. Cyprus low events are indicated in red and cold spells in blue. The inter-quartile range is plotted for the physical properties of the air parcels. 0 h corresponds to the respective event onset, namely the first day of the event and at 00 UTC two flanks of the upper level trough have lower intrinsic predictability (higher d and θ) than the central part of the trough. As the cyclone develops, we therefore expect the dynamical systems metrics to show a minimum in Z500 (Fig. 6c), since the leading flank is the first to enter the domain, followed by the trough centre (see for example Figures S2 and S3 for the 'Alexa' cold spell). In addition, the maxima of d and θ computed for SLP occur several hours before the minima calculated for Z500. This likely reflects the westward tilt with height of the low-pressure systems, so that the upper level trough reaches the region later than the surface low ( Fig. 2a and an example for the 'Alexa' cold spell in Figure S3). Finally, we note that variability in the temporal evolution of the dynamical systems metrics across the different cold spells is smaller in Z500 than in SLP (shading in Fig. 6). The different temporal evolution of the dynamical systems metrics points to the value in using different variables at different levels to obtain a complete picture, but also to the interpretational challenges posed by this analysis approach.
We also assess the sensitivity of the dynamical systems metrics to the size and location of our geographical domain. No qualitative differences are found for shifts or small increases in the domain size (not shown). The only noticeable difference is found for the local dimension (d) distribution computed on SLP and for the largest domain Fig. 4 The de-seasonalized dynamical systems metrics' (d and θ) cumulative distribution functions for cold spells associated with snowfall in Jerusalem (blue curves) and other Cyprus Lows (red curves). The dynamical systems metrics are computed on: a, b sea level pressure-SLP and c, d 500 hPa geopotential height-Z500. Significant differences between the CDFs and medians are marked with p < 0.05 (Sect. 2.3, Figure S4a). For such a large domain, the dynamical systems analysis captures a large part of the subtropical high features alongside the cyclonic features ( Figure S2). We therefore conclude that the dynamical systems metrics are in our case largely insensitive to domain size, as long as the main features of the synoptic system of interest are captured within the domain, and We analyse next numerical ensemble forecasts from the GEFS reforecast dataset for both Cyprus Lows and cold spells. Cold spell forecasts typically display a higher spread than other Cyprus Lows at lead times of 69 h (Fig. 8b, f). The spread for both peaks at negative lags, indicating that forecasts initialised prior to the events are more uncertain than forecasts initialised during or after the events (Fig. 8a, e). This agrees with the information provided by the dynamical systems metrics computed on SLP, but contradicts the metrics computed on Z500. No significant differences are found between cold spells and Cyprus Lows in terms of the mean model absolute error (Fig. 8d). However, the two types of events exhibit a different temporal evolution, with the cold spells showing a peak in the absolute error 48 h prior to the event onset (Fig. 8c). The peak at 48 h corresponds to forecasts initialised just before the largest changes in d and θ computed on both SLP and Z500 (cf. Fig. 8c with Fig. 6a, c). This points to the possibility that sharp changes in the values of the two dynamical systems metrics, corresponding Fig. 6 The average temporal evolution of the de-seasonalized dynamical systems metrics (d and θ) for cold spells associated with snow cover in Jerusalem (a, c) and other Cyprus Lows (b, d). The dynamical systems metrics are computed on: a, b sea level pressure-SLP and c, d 500 hPa geopotential height-Z500. The events are centred on the first day of a cold spell or Cyprus Low and at 00 UTC. A 95% bootstrap confidence interval is shown in shading to a rapid change in the properties of the atmospheric flow, may indicate lower practical predictability. The corresponding plots for forecast valid time (see Sect. 2.4), are provided in Figure S5. As the spread/skill relation for cold spells is relatively poor, only limited conclusions can be drawn from the ensemble reforecast analysis and from the parallels with the dynamical systems metrics. Generally, the spread/skill at the individual stations is comparable to the average forecast spread/skill (not shown). The largest difference is found for Elat station (Table S1). This station is located at the southernmost tip of Israel and therefore may not be influenced by most Cyprus Lows and cold spells ( Figure S1).

A detailed analysis of the 'Alexa' cold spell
The evolution of storm 'Alexa' is analysed as a case study (Fig. 9). The SLP and Z500 patterns for the first day of 'Alexa' are comparable with the average configuration of a cold spell, albeit with a deeper upper level trough and a deeper surface cyclone (cf. Fig. 9a with Fig. 2a, noting the different colour ranges). Moreover, the temporal evolution of 'Alexa' from a dynamical systems point of view (Fig. 9b, c) closely resembles the climatological signature of an average cold spell, only with larger absolute values (cf. Fig. 9b, c with Fig. 6a, c). These are not only due to the fact that we are not averaging over several events; indeed, 'Alexa' is a cold spell episode with one of the largest ranges of d and θ values over the considered time range. The 'Alexa' event is also located in the upper 1% of Cyprus Lows for d computed on SLP (Hochman et al. 2019), and in the lower 5% of d and θ computed on Z500. This suggests that 'Alexa' is not only an extreme event concerning its weather and impact, but also from a dynamical systems point of view. The deep Eastern Cyprus Low associated with 'Alexa' may have influenced the extreme values of the dynamical systems metrics (see Sect. 3.2).
The air parcel analysis reveals that storm 'Alexa' may be considered an archetype for Eastern Mediterranean cold spells. The parcels are embedded in a large-scale "omega" flow pattern over Europe and follow a pronounced continental northerly pathway (Fig. 10a, b). Furthermore, the initial potential temperature and temperature of the air parcels are much lower than the climatology of cold spells by about 8 K and 6 K, respectively (cf. Fig. 10d, e with Fig. 3d, e). The largest d and θ values computed on SLP coincide with the arrival of the majority of air parcels over the Mediterranean (Fig. 10b), and the time at which they substantially increase their potential temperature, temperature and specific humidity ( Fig. 10d-f). Indeed, storm 'Alexa' is situated in the upper decile of change in specific humidity and potential temperature over the 48 h prior to the event onset, which may reflect on the dynamical systems metrics values for this event (see Sect. 3.2 and Fig. 7). Figure 11 shows the temporal evolution of the forecast spread/skill for the 'Alexa' storm compared to the climatology of cold spells. Throughout the lead up and early phases of the event, the forecast displays a higher error than for other cold spells (Fig. 11b). On the contrary, the spread peaks at 48 h before the event onset, but displays mostly lower values than other cold spells at other times (Fig. 11a, c). The temporal evolution of the SLP ensemble Fig. 7 The difference in the de-seasonalized dynamical systems metrics computed on SLP for Cyprus Lows split into the upper (solid blue lines) and lower 10% (solid red lines) of change in specific humidity over the 48 h before the event onset. a d (local dimension) b θ (1/persistence). Storm 'Alexa' is marked with a black arrow. Significant differences between the CDFs and medians are marked with p < 0.05 spread mirrors quite closely that of the dynamical systems metrics computed on SLP, peaking in the two days prior to the event's onset (cf. Fig. 11c with Fig. 6a). Thus, the analysis of the 'Alexa' cold spell supports the results for the whole sample of cold spells associated with snow in Jerusalem. The corresponding plots for forecast valid time (see Sect. 2.4), are provided in Figure S6. While we argue (Sect. 2.4) that forecast initial time is closer to the information synthesised by the dynamical system metrics, we note that the picture changes when using forecast valid time, as a dip in SLP spread is present at small negative lags.

Summary and conclusions
We use a combination of dynamical systems theory, numerical weather forecasts and Lagrangian parcel tracking to investigate the evolution and predictability characteristics of Eastern Mediterranean cold spells leading to snow in Jerusalem. We compare these to 'regular' Cyprus Lows-the dominant precipitation-bearing wintertime systems in the region. The choice of extreme event is motivated by the fact that Eastern Mediterranean cold spells are considered difficult to predict by the region's forecasters (Wolfson and Adar 1975;Bitan and Ben-Rubi 1978;Goldreich 2003).
Significant differences are found between cold spells and 'regular' Cyprus Lows from a dynamical systems perspective. When considering SLP, the intrinsic predictability of cold spells is lowest, i.e., comparatively high local dimension (d) and low persistence (θ −1 ), at the time when the majority of air parcels associated with the inflow of cold air interact with the Mediterranean Sea. At this time, the parcels rapidly increase their temperature, potential temperature and specific humidity. This is likely related to intense air-sea fluxes, which play a prominent role in the generation and intensification of Cyprus Lows, thus influencing the ability to predict them (e.g., Davis and Emanuel 1988;Alpert et al. 1995). In this study, we provide new evidence for the important role that air-sea fluxes may play in determining the intrinsic predictability of Cyprus Lows and particularly cold spells. The dynamical systems metrics computed on Z500 display a different temporal evolution to their SLP counterparts, highlighting the different characteristics of the atmospheric flow at the different levels. We further test the sensitivity of the dynamical systems metrics to the location and depth of Cyprus Lows, and conclude that both play a central part in determining the intrinsic predictability.
The detailed case-study of storm 'Alexa' provides further insights. While following a similar evolution to other cold spells (albeit with larger anomalies), 'Alexa' was poorly forecasted (Hochman et al. 2019), and we argue that the simple and inexpensive computation of the dynamical systems metrics and a comparison of their evolution to that seen during previous cold spells, could have served as a red warning light for the region's forecasters on the unusualness of the event. Specifically, signatures of regional extreme weather events might be reflected in the dynamical systems metrics, even in cases when the actual extreme is not well-simulated by numerical forecasting systems. We thus argue that this approach provides a complement to more conventional analyses based on atmospheric dynamics and ensemble numerical weather forecasts. This is particularly true in a warming world, a fact which adds another level of complexity to the tasks at hand (e.g., Faranda et al. 2019b;Hochman et al. 2020).
Finally, we tested the sensitivity of the dynamical systems metrics to the size of the domain they are computed on. We conclude that in our case the dynamical systems metrics are largely insensitive to domain size, as long as the main features of the weather systems of interest are captured. A logical next methodological step would be to compute the dynamical systems metrics for cyclonic events in a Lagrangian manner, i.e., tracking specific weather systems, in order to avoid biases linked to the choice of geographical domain.
While providing numerous insights on Eastern Mediterranean cold spells, our analysis has also highlighted how the interpretation of the dynamical systems results poses some challenges. When interpreting the dynamical systems characteristics of Cyprus Lows/cold spells, a complex picture arises. For example, the depth/location of Cyprus Lows and the magnitude of air-sea fluxes influence d and θ −1 in apparently inconsistent ways. Moreover, the link between the physical properties of a cyclone and its predictability depends on the atmospheric level, the variable(s) being considered, and on the specific property to be evaluated. The synergistic effects of these different factors were not evaluated here and will be dealt with in a follow-up study.
Similar difficulties ar ise when relating the dynamical systems' intrinsic predictability to the Fig. 9 A dynamical systems analysis for the 'Alexa' cold spell. Sea level pressure (shading in hPa) and 500 hPa geopotential height (white contours in m) on 12 December 2013 at 00UTC (a). The deseasonalized dynamical systems metrics' temporal evolution centred on 12 December 2013 at 00 UTC computed on: b sea level pressure-SLP and c 500 hPa geopotential height-Z500 ensemble forecasts' practical predictability. While the differences between the two may be partly ascribed to the different nature of the two quantities, they may also depend on shortcomings in our interpretation of the dynamical systems metrics or on shortcomings of the GEFS ensemble data. The latter ensemble, as most numerical ensemble forecasts, is under -dispersive (e.g., Kuene et al. 2014). In other words, the spread of the ensemble members does not always reflect the range of possible future evolutions of the atmosphere, making the concept of practical predictability elusive.
To conclude, the two instantaneous dynamical systems metrics adopted in this study, and specifically their temporal evolution, can provide a valuable complement to conventional analyses for the study of cold spells and their predictability. At the same time, care must be taken in interpreting them. Due to its computationally inexpensive nature, we are convinced that a similar approach may be fruitfully applied to other geographical regions and weather extremes.

Fig. 10
Backward trajectory air parcel tracking for the 'Alexa' cold spell initialized on 12.12.2013 at 00 UTC (0 h in the panels) with a circles indicating days (from − 10 days to − 6 days before 12.12.2013 at 00 UTC), trajectory density 10 days before onset (shading in number of trajectories per 1000 km 2 ), stream lines of 800 hPa wind (averaged between − 10 days and − 6 days before 12.12.2013 at 00 UTC). b as in a, but for − 5 days to − 1 day and trajectory density 5 days before onset (shading in number of trajectories per 1000 km 2 ). Median evolution of c height (hPa), d potential temperature (K), e temperature (K) and f specific humidity (g kg −1 ) of the tracked air parcels. The inter-quartile range is plotted for the physical properties of the air parcels Here, we provide a brief overview of the computation of the d and θ −1 metrics used in the study. The metrics are issued from a dynamical system view of the evolution of our atmosphere, where we interpret each variable (strictly speaking, observable) as a special Poincaré section of the full atmospheric dynamics. The temporal evolution of a given variable in a given geographical domain over time may then be represented as a long trajectory in phase-space. In turn, each latitude-longitude map for the chosen variable and at a given time corresponds to a single point along the trajectory, for which local properties can be computed. Locality in phase space is thus equivalent to instantaneity in time.
We consider an observed trajectory x(t)-for example a succession of daily SLP maps-and a state of interest ξ, the latter corresponding in physical space to a latitude-longitude map of the chosen variable at a given time, and in phase-space to a given point along the variable's trajectory. We then characterise the recurrences of the system around ξ. In other terms, we search for timesteps at which the SLP displays similar latitude-longitude maps to that of the state of interest. To identify these, we define the observable function g: g(x(t), ) = −log(dist(x(t), )) where dist is a distance function (in our case the Euclidean distance between the SLP maps). The application of the negative logarithm increases the discrimination of close recurrences, and also implies that g is large when dist is small. Recurrences may thus be identified based on g exceeding a high threshold s. Here, we chose s as the 98 th percentile of g(x(t), ξ) itself, thus making s a function of ξ and of the chosen percentile q. For all g(x(t), ) > s(q, ) , namely a recurrence, we can then define exceedances u as: Following the Freitas-Freitas-Todd theorem (Freitas et al. 2010), modified in Lucarini et al. (2012), the cumulative probability distribution F (u, ξ) converges to the exponential member of the Generalised Pareto Distribution: whose parameters depend on the chosen ξ. When applying this approach to a given dataset, each timestep-in our example each SLP latitude-longitude map-serves in turn as a state of interest. Then, g can be evaluated for that latitude-longitude map and the maps at each of the other timesteps in the dataset, and the local (instantaneous) dimension d(ξ) is given by: If x(t) contains the system's full set of phase-space variables, then d is independent of the chosen dist for all ξ. In our practical application, this is naturally not the case and different choices of dist will yield different values of d. It is therefore important to interpret d in a relative, as opposed to absolute sense.
We can further compute the persistence of the state ξ from the a-dimensional extremal index ϑ (Moloney et al. 2019). We define the average residence time of the phase-space trajectory around ξ, or in other words the state's persistence, as: Here, Δt is the time interval between successive timesteps in our dataset, for example 1 day, and −1 is the persistence in the same units as the timestep. We estimate the extremal index following Süveges (2007).
The above procedure results in a value of d and θ for every time step in our dataset. For the limits of applicability to non-stationary systems and to comparatively short time series, the reader is referred to Freitas et al. (2017) and Buschow and Friederichs (2018).