Assessment of an ensemble of ocean–atmosphere coupled and uncoupled regional climate models to reproduce the climatology of Mediterranean cyclones

This study aims to assess the skill of regional climate models (RCMs) at reproducing the climatology of Mediterranean cyclones. Seven RCMs are considered, five of which were also coupled with an oceanic model. All simulations were forced at the lateral boundaries by the ERA-Interim reanalysis for a common 20-year period (1989–2008). Six different cyclone tracking methods have been applied to all twelve RCM simulations and to the ERA-Interim reanalysis in order to assess the RCMs from the perspective of different cyclone definitions. All RCMs reproduce the main areas of high cyclone occurrence in the region south of the Alps, in the Adriatic, Ionian and Aegean Seas, as well as in the areas close to Cyprus and to Atlas mountains. The RCMs tend to underestimate intense cyclone occurrences over the Mediterranean Sea and reproduce 24–40 % of these systems, as identified in the reanalysis. The use of grid nudging in one of the RCMs is shown to be beneficial, reproducing about 60 % of the intense cyclones and keeping a better track of the seasonal cycle of intense cyclogenesis. Finally, the most intense cyclones tend to be similarly reproduced in coupled and uncoupled model simulations, suggesting that modeling atmosphere–ocean coupled processes has only a weak impact on the climatology and intensity of Mediterranean cyclones.

In such a unique environment, cyclogenesis is frequent, placing the Mediterranean among the regions with the highest occurrence of cyclones in the world (Neu et al. 2013).
Intense cyclones in the MR mainly form due to synoptic scale atmospheric systems, such as upper-tropospheric Rossby waves and stratospheric air intrusions (Fita et al. 2006;Claud et al. 2010;Flaounas et al. 2015a). They mainly occur in winter and autumn and they are favored at the leeward side of the Alps and Atlas Mountains, over the Gulf of Lions, the Ionian, Adriatic and Aegean Seas (e.g., Alpert et al. 1990;Trigo et al. 1999;Maheras et al. 2001;Campins et al. 2011;Lionello et al. 2016). These cyclones have a strong impact on the Mediterranean climate, its variability and extremes (e.g., Trigo et al. 2000;Lionello et al. 2006;Gaertner et al. 2007;Lionello and Giorgi 2007;Walsh et al. 2014;Zappa et al. 2015). Indeed, several studies showed the significant contribution of cyclones to the majority of rainfall and wind extremes (e.g., Jansà et al. 2001;Nissen et al. 2010;Reale and Lionello 2013;Raveh-Rubin and Wernli 2015), to the formation of the most prominent high impact weather events (Llasat et al. 2010), as well as to the modulation of the Mediterranean hydrological cycle (Romanski et al. 2012;Flaounas et al. 2015b).
Given their importance to the Mediterranean climate, cyclones proper representation by models is crucial for studying climate dynamics and impacts. To this end, reanalysis data is considered to provide realistic results due to the assimilation of a plethora of observations into atmospheric models. However, global reanalyses are typically available with relatively coarse resolutions that prohibit the detailed reproduction of cyclone meso-scale dynamics and weather extremes. To address this issue, RCMs have been long ago employed to analyze climate dynamics across different spatial scales (e.g., Giorgi 1990;Deque and Piedelievre 1995). In fact, several recent studies demonstrated the benefits from the use of RCMs in reproducing climate patterns at local scales within the MR and for improving the quality of input data for impact studies in areas of complex geography (Flaounas et al. 2013;Guyennon et al. 2013;D'Onofrio et al. 2014;Calmanti et al. 2015). Despite their ability to resolve fine-scale atmospheric features, RCM results are associated with significant uncertainties. For instance, RCMs are known to be particularly sensitive to the physical parameterization of atmospheric processes, to the boundary conditions, horizontal/vertical resolution (e.g., Roeckner et al. 2006;Sanchez-Gomez et al. 2011;Herrmann et al. 2011;Flaounas et al. 2011;Di Luca et al. 2014), and internal variability (e.g., Christensen et al. 2001;Sanchez-Gomez and Somot, 2016). Especially the latter is a primary source of uncertainty and leads to spread in RCM behavior under the same external conditions.
One of the specific objectives of the Coordinated Regional Downscaling Experiment (CORDEX) initiative is to provide a coordinated framework for the systematic investigation of RCM uncertainties in order to provide meaningful atmospheric data on fine scales (Giorgi et al. 2009). The MED-CORDEX initiative is the declination of CORDEX devoted to the MR (www.medcordex.eu; Ruti et al. 2015). In line with the objectives of CORDEX, in this study we assess the capacity of twelve RCM simulations to realistically reproduce Mediterranean cyclones, especially the most intense systems, which are also expected to provoke the majority of regional climate extremes and the strongest socio-economic impacts. These simulations are produced by seven different models, where two models have been used as stand-alone atmospheric RCMs, while the other five have been used both as atmospheric RCMs and as interactively coupled RCMs with oceanic models.
This study aims first at contributing to the better understanding of RCM deficiencies when used for climatological studies on cyclone-related impacts and dynamics. Second, Fig. 1 The Mediterranean domain that has been used for the cyclone tracking (black box). The shading indicates terrain elevation (in m). The grey thick line depicts the subjectively drawn track of the December 2005 medicane, using the SLP from ERAI. The colored lines show the medicane tracks, as diagnosed by the different tracking methods (described in the Appendix) this study assesses the capacity of RCMs to realistically reproduce intense Mediterranean cyclones, and finally it investigates the potential added value in using coupled models for reproducing intense Mediterranean cyclones. The RCMs assessment is achieved by comparing their cyclone climatologies to the one produced by the reanalysis, which also provided the RCMs initial and lateral boundary atmospheric conditions. The next section presents the models and the cyclone tracking methods. Section 3 provides a cyclone climatology comparison between ERA-Interim (ERAI; Dee et al. 2011) and the MED-CORDEX models and finally, in Sect. 4 we summarize our methods and results, and present the main conclusions of this study.
2 Models, cyclone tracking and methodology 2.1 Regional climate models and tracking methods All simulations used in this study have been performed within the framework of MED-CORDEX, for the period 1989-2008 and their domains cover the broader MR with small variations in their size. The RCMs initial and lateral boundary conditions have been taken from ERAI at a grid spacing of 0.75° × 0.75° in longitude and latitude, every 6 h. The time varying field of sea surface temperature (SST) was also taken from ERAI for the uncoupled simulations. In addition, the simulations for five RCMs have been repeated with the SST fields explicitly calculated by coupled high-resolution regional oceanic models. All RCMs included in this study present a variety of different characteristics such as different horizontal resolutions, number of vertical levels and physical parameterizations. In particular, the WRF model is the only model that applies grid nudging to wind, temperature and water vapor on all atmospheric levels except of those within the planetary boundary layer. Table 1 presents the details and references of the simulations and models used in this study.
To identify and track cyclones, six different cyclone tracking methods have been applied, based on either sea level pressure (SLP) or relative vorticity at 850 hPa (RV), i.e., the two atmospheric fields that are commonly used for tracking cyclones in gridded datasets (Hoskins and Hodges 2002;Neu et al. 2013). SLP is a low frequency field that reflects the atmospheric mass distribution, and is representative mainly of synoptic-scale atmospheric processes. On the other hand, relative vorticity is a high frequency field that is representative of the atmospheric circulation. Method 1 ) and method 2 (Ayrault and Joly 2000) are both RV-based methods, identifying cyclones as local maxima of RV at 850 hPa. Methods 3-6 are all based on SLP. In particular, method 3 (Kelemen et al. 2015) and method 6 ) identify cyclones as local SLP minima, while method 4 (Lionello et al. 2002;Reale and Lionello 2013) and method 5 (Wernli and Schwierz 2006) perform a processing of SLP fields prior to determining cyclone centers as local SLP minima. All methods are described in detail in the Appendix and have been previously applied in the MR, among other methods that were used to study the MR climatology of cyclones (e.g., Murray and Simmonds 1991;Trigo et al. 1999;Pinto et al. 2005;Bartholy et al. 2009;Campins et al. 2011). Our motivation in using a variety of cyclone tracking methods is the fact that their results may present significant variability, even when considering same periods and input datasets (e.g., Bartholy et al. 2009;Campins et al. 2011;Lionello et al. 2016). Uncertainties exist due to differences in the mathematical and physical definitions of cyclones among the methods, as well as due to the tracking tools' sensitivity to several factors such as the horizontal grid spacing of the input datasets (Kouroutzoglou et al. 2011;Hanley and Caballero 2012;Rudeva et al. 2014). The sensitivity of extratropical cyclone climatologies due to the use of different tracking methods has been thoroughly analyzed within the framework of the IMILAST project (Neu et al. 2013), while Lionello et al. (2016) specifically addressed this question for the MR. Figure 1 provides an example of the uncertainty in cyclone tracking due to the use of different tracking methods. It illustrates the tracks of a medicane (merging of the words Mediterranean Hurricane) that occurred in December 2005 (Fita et al. 2007). The tracks are calculated by the six tracking methods, applied to ERAI, and are compared to the manually analyzed track, by connecting the ERAI local SLP minima at consecutive times. It is evident that methods present a common part of their tracks with the reference, i.e., they all succeeded to capture the cyclone's eastwards displacement over the Mediterranean Sea. However, the lifetime (equivalent to the number of track points), the initial and final location, as well as the cyclone's exact path differ from method to method. The use of multiple tracking methods constitutes an added value permitting the interpretation of RCM performances in reproducing cyclone tracks from different physical perspectives (Neu et al. 2013).
For the assessment of the models' capacity to reproduce the Mediterranean cyclone climatology, we use the ERAI reanalysis as a reference. Therefore, we assess the realistic reproduction of the cyclone tracks by the RCMs, as well as the divergence of their results with respect to the driving dataset. In order to perform a meaningful and fair intercomparison between the models and ERAI, all 6-hourly RCM outputs of SLP and wind (used to calculate relative vorticity) have been upscaled to the ERAI grid (at 0.75° spatial resolution). Upscaling regional models is opposed to the motivation of performing climate downscaling, however it provides a straightforward and fair intercomparison of cyclone climatologies between RCMs and ERAI (Kouroutzoglou et al. 2011). Upscaling RCM fields (with resolutions of 20-50 km; Table 1) is expected to have a detrimental impact only on the detection of small and weak cyclonic features with characteristic lengths inferior to ~80 km (this is the average 0.75° × 0.75 grid spacing in the Mediterranean region). Such cyclonic systems unlikely have a strong impact on climate dynamics and extremes.
Cyclone intensity for the RV-based methods 1 and 2 is measured by the maximum RV in the cyclone center. However, methods 1 and 2 apply data filtering before identifying cyclones and hence their cyclone intensity measures are not directly comparable to the other methods. To overcome this inconsistency, the cyclone track points from methods 1 and 2 have been re-attributed to the raw relative vorticity values of the RCM simulations (after regridding to the ERAI grid). For SLP-based methods 3-6, cyclone intensity is determined by the minimum central SLP value along the cyclone track. All six tracking methods are applied to all RCM outputs, as well as to ERAI for the 20-year period of 1989-2008 and within the domain shown in Fig. 1. For all methods, we retain only the cyclones that exist for at least one day (i.e., tracks with at least five 6-hourly track points), as also done in Neu et al. (2013).

Using a multi-method approach for tracking cyclones
The 20-year ERAI Mediterranean cyclone climatology, as reproduced by the six different tracking methods, is shown in Fig. 2 in the form of maps of cyclone center densities (CCD; Neu et al. 2013). The CCD is defined by the frequency of occurrence of cyclone centers within an area of 500 km 2 in a given 6-hourly time step. Percentages higher than 100 % denote the existence of more than one cyclone in average in an area of 500 km 2 per time step.  (Kain 2004), the WSM5 microphysics scheme (Hong et al. 2004) and the RRTMG longwave and shortwave radiation scheme (Clough et al. 2005) IPSL-WRF311NEMO The WRF atmospheric model coupled with the oceanic model NEMO (Drobinski et al. 2012;Lebeaupin Brossier et al. 2013) ENEA-REGCM3 RegCM version 3 (Giorgi and Mearns 1999;Pal et al. 2007) model at 30 km horizontal resolution and 18 vertical levels ENEA-PROTHEUS Composed of RegCM and the MITgcm developed by Marshall et al. (1997a, b) and validated over the Mediterranean area by Sannino et al. (2009). RegCM3 receives the instantaneous values of SST every 6 h. The MITgcm receives from RegCM3 the 6-hourly cumulated rainfall and the surface forcing fields for the ocean, i.e. wind stress, sensible heat flux, latent heat flux, long and short wave incident radiation (Artale et al. 2010) University of Frankfurt -COSMO-CLM CCLM version 4.21 ) is used for atmospheric-only simulations with horizontal grid resolution of 50 km and 32 vertical levels The RV-based methods 1 and 2 ( Fig. 2a, b) tend to identify more cyclone centers than the SLP-based methods 3-6 ( Fig. 2c-f). All methods capture fairly well the main "hot spots" of Mediterranean cyclones over the leeward side of the Alps, over the Adriatic and Aegean Seas, and near Cyprus (Trigo et al. 1999;Campins et al. 2011). The Ionian Sea is also known to present a high concentration of cyclones, however in contrast to the other methods, methods 4 and 6 show no distinct hot spot of CCD in this area. It is noteworthy that the CCD maximum south of the Alps and near Cyprus are located more to the west in the RVbased methods, compared to SLP-based methods. This is mainly due to wind steering at the mountain flanks, which tends to displace the RV local maxima to the west. On the other hand, SLP-based methods may capture the SLP minima also over the mountains, where surface pressure is extrapolated to sea level. The Northwest African cyclones, known as Sharav cyclones, which form near the Atlas mountain chain (Ammar et al. 2014), are also captured by methods 1, 2 and 5, while method 4 presents no particular local maximum of CCD in this area. Method 3 uses a spatial filtering that excludes the domain boundary grid points. Consequently, most of North Africa is not taken into account by this method (see Appendix). Table 2 shows significant differences in the number of cyclones captured by each method when applied to ERAI. Differences in cyclone numbers may not be regarded as a token of good or poor skill of a cyclone tracking method, but it rather shows that the cyclone number depends significantly on the adopted physical definition of a cyclone by each method. Figure 3a shows the cumulative distribution functions of cyclone lifetime in ERAI and Fig. 3b shows the cumulative Fig. 2 Cyclone center density as diagnosed by the six different tracking methods during the whole period 1989-2008 in ERAI. White areas correspond to values of less than 1 % distribution functions of cyclone maximum intensity. Lifetime and maximum intensity strongly depend on the applied cyclone tracking method. For instance, methods 2 and 4 tend to identify long lasting cyclone tracks ( Fig. 3a) where approximately 30 % of the tracked cyclones last more than 3 days. In contrast, the cyclones with lifetime of more than three days correspond to less than 20 % of all cyclones detected by method 5 and about 5 % of all cyclones detected by methods 1, 3 and 6. A plausible explanation for the variable cyclone lifetimes is the fact that different cyclone definitions by the tracking methods may result in capturing cyclones at different stages of their evolution. Concerning the cumulative distribution function of cyclones' maximum intensity (Fig. 3b), the intensities in the RVbased methods 1 and 2 are fairly equally distributed. For both methods the 20 % of the most intense cyclones present RV values of more than 12 × 10 −5 s −1 , while less than 20 % of the tracked cyclones present intensities weaker than 6 × 10 −5 s −1 . Also all SLP-based methods are in fair agreement. Minimum pressure for the 20 % of the cyclones is less than 997 hPa in methods 3 and 6 and in methods 4 and 5 is less than 1000 hPa. On the other hand, the 20 % of the cyclones have core pressures of more than 1007 hPa for methods 3 and 6 and more than 1010 hPa for methods 4 and 5. It is noteworthy that by design method 3 detects cyclones as local SLP minima of less than 1013 hPa.
In this study we focus on the most intense Mediterranean cyclones, however the cyclones' intensity might be sensitive to the ERAI grid spacing, for both SLP and RVbased cyclone tracking methods (Murakami and Masato 2010;Kouroutzoglou et al. 2011). To avoid defining arbitrary thresholds on cyclone intensity and in order to analyze a statistically robust ensemble of intense Mediterranean cyclones, in this study we choose to analyze the 500 most intense cyclones from each method. We thus form a set of intense cyclones for ERAI and one for each RCM. These datasets each contain a total of 3000 intense cyclones (i.e., 500 cyclones from each cyclone tracking method). To simplify the analysis, we retain only the mature stage of the cyclones (the time and location of the track points with minimum SLP or maximum RV). Note that in each dataset the same cyclone may be detected by more than one method. We remove these multiples by considering that two cyclones are the same if their centers are located within no more than 3° in longitude and latitude, and if both cyclones occur within a time difference of no more than 12 h. This methodology to diagnose "identical" cyclone tracks is simpler than the one applied by Blender and Schubert (2000) or Froude et al. (2007), who used a criterion based on the whole cyclone tracks. However, here we treat a complex dataset with tracks reproduced from different methods and hence of different track characteristics (e.g., track lengths of the same cyclones might be considerably different; Fig. 3a). Removing multiples leads to 1600-1800 intense cyclone tracks per dataset (hereinafter referred to as the 500 most intense cyclones). Figure 3c shows the seasonal cycle of the 500 most intense cyclones in ERAI. All methods show a lower frequency of intense cyclones in summer and a higher one in spring and winter, in agreement with previous studies of intense Mediterranean cyclones (e.g., Campins et al. 2011;Cavicchia et al. 2014;Flaounas et al. 2015b).

Assessment of RCMs to reproduce cyclone climatology in the Mediterranean region
In order to assess the RCMs capacity to reproduce the Mediterranean cyclone climatology, we first calculate the CCD of each RCM and then compare it to the CCD of ERAI (Fig. 2). Online Resources 1 presents the CCD through the perspective of each cyclone tracking method and for each RCM. Compared to ERAI (Fig. 2), all RCMs tend to capture the main regions of frequent cyclogenesis (near the Alps, over the Adriatic, Ionian and Aegean Seas and the marine areas close to Cyprus). To statistically quantify the RCMs performance, Fig. 4 shows Taylor diagrams (Taylor 2001) that compare the CCD of the RCMs with regards to the CCD of ERAI, i.e., the CCDs as shown in Fig. 2 and in Online Resources 1, respectively. The correlation of the CCDs between the RCMs and ERAI is represented by the angle with the vertical axis, the corresponding centered root mean square error (RMSE) is represented by the distance from the black dot (representing ERAI), while the distance from the origin represents the standard deviation of the RCM (or ERAI). Since the CCD is not homogeneous in the region, the correlation, the RMSE and the standard deviation of each RCM reflect the models' capacity to properly represent the level and location of the regional Fig. 3 Cyclone track characteristics with respect to the applied cyclone tracking method. a Cumulative distribution function of the cyclone lifetimes. b Cumulative distribution function of the cyclone intensities. RV-based methods 1 and 2 refer to the right axis, SLP-based methods 3-6 refer to the left axis. Note that the strongest cyclones are represented in the RV-based methods by the highest percentiles and in the SLPbased methods by the lowest percentiles. c Seasonal cycle of the occurrence of the 500 most intense cyclones CCDs "hot spots" compared to ERAI. Several methods are sensitive to identifying cyclones close to the domain lateral boundaries, especially method 4, and hence the analysis has been restricted to the region from 30°N to 46°N and from 5°W to 37.5°E. The results when using the RV-based methods 1 and 2 show that the RCMs present a small statistical spread (i.e., the models have similar values of the RMSE and standard deviation). The results of method 1 show that almost all RCMs have CCD RMSE values smaller than 15 %, standard deviations that are close to the reanalysis (about 15 %) and correlation scores between 0.6 and 0.95. On the other hand, according to method 2, the RCMs present correlation scores between 0.4 and 0.8, as well as RMSE values of more than 20 %. Methods 1 and 2 reproduce more cyclones than the other methods, reflecting their ability to also capture weak cyclonic circulations. The similar spread of RCMs when using these methods suggests that all models perform similarly in capturing these features. However, the poorer statistical scores for method 2, than for method 1, are most likely due to very weak cyclones that are filtered out in method 1 (RV less than 3 × 10 −5 s −1 ) and which are expected to be less consistent between the RCMs and the reanalysis. Indeed, the reproduction of weak and small cyclones are expected to be particularly sensitive to model resolution and thus are expected to be more frequent in the cyclone climatology of RCMs than of ERAI.
Methods 3 and 6 detect the fewest cyclones (Table 2) and therefore the model comparison with ERAI in Fig. 4 reveals small RMSE and standard deviation values, compared to the other methods. A low RMSE does not imply that RCMs present "better" results when using these methods. In fact, the RMSE tends to provide relatively small values for weak and smooth fields (as the ones produced by methods 3 and 6) and tends to "punish" models that produce more variable fields with peaks at the wrong locations (the so-called double-penalty problem discussed in the verification literature, e.g., Wernli et al. 2008). In fact, for these methods, the CCD is higher over the Adriatic, Ionian and western Mediterranean seas (Fig. 2, Online resource 1), where the most intense cyclones are expected to occur (Flaounas et al. 2015b). Indeed, methods 3 and 6 apply a rather strong criterion on cyclone intensity, detecting only cyclones with SLP lower than 1013 hPa and with a SLP gradient greater than 0.5 hPa per 100 km −1 , respectively. It is plausible thus to suggest that most of the RCMs capture correctly the occurrence of the most intense cyclones in the central Mediterranean. For methods 3 and 6 only UCLM-PROMES (as well as ENEA-REGCM3 for method 6) does While strong cyclones seem to be adequately captured by all RCMs (Fig. 2 and Online resource 1), method 5 presents a wide statistical spread in the RCMs performance. In fact, method 5 tends to capture the intense winter cyclones with closed contours of SLP and therefore similar statistical scores should be expected by all RCMs in Fig. 4. In contrast, method 5 presents the widest spread of the RCMs performance among the methods, with the COSMO-CLM, the CNRM and CMCC-CCLM having a similar CCD standard deviation as in ERAI (about 20 %) and the highest spatial correlation with the reanalysis. Further analysis suggests that the better statistical scores of these RCMs mostly reflect their capacity to capture the cyclones occurring close to the Alps, the Adriatic Sea and Cyprus. In these areas intense cyclones are most frequent and thus have a strong impact on statistical scores of method 5. Method 4 tends to smooth the CCD spatial variability over these areas ( Fig. 2 and Online Resources 1) and hence it comes as no surprise that the COSMO-CLM, the CNRM and the CMCC-CCLM RCMs also present, as in method 5, the best performance compared to the other models. Figure 4 shows that the coupled versions of the model simulations yield only small deviations of their statistical scores when compared with the atmosphere-only models. Online Resources 2 shows the CCD differences between the atmosphere-ocean coupled and uncoupled model versions. Indeed, the CCD differences exceed 5 % for methods 1, 2 and 5, while differences for methods 3, 4 and 6 are less than 2 %. Such small CCD differences are supported by the fairly equal number of cyclone tracks between the coupled and uncoupled models, regardless of the method used (Table 2). Larger differences are shown in Online Resources 2 for method 5 close to the Turkish Aegean and southern coasts. This suggests that SST differences near these mountainous coasts may affect significantly the pattern of enclosing SLP contours and thus may modulate CCDs between coupled and uncoupled RCMs. This behavior of Method 5 in the eastern Mediterranean is observed for all RCMs.

Intense cyclone climatologies in the RCMs
In this section, we concentrate on the most intense cyclones. In order to assess the RCMs capacity to reproduce the same cyclones as in ERAI, we perform the same procedure as for filtering duplicate cyclones within the same cyclones dataset (Sect. 2.2). However, this time the filtering criteria are used to find equally reproduced cyclones between the RCMs and ERAI. Table 3 shows the percentage of ERAI intense cyclones reproduced by each RCM. Depending on the model, the similarity scores differ significantly between 24 and 40 %, suggesting that RCMs are able to reproduce about one third of the most intense ERAI cyclones. The IPSL-WRF311 model presents the best similarity score (of the order of 60 %) regardless of the cyclone tracking method used. This can be largely attributed to the application of grid nudging to the wind, temperature and water vapor fields above the planetary boundary layer (Omrani et al. 2013). This operation hinders this RCM to diverge from ERAI and therefore the high similarity score is a trivial consequence of nudging. Figure 5 shows the CCD for the most intense cyclones only, as detected in ERAI (Fig. 5a) and in the uncoupled RCMs in the form of differences from ERAI ( Fig. 5b-h).
In accordance with previous studies, the most intense systems develop mainly over the western Mediterranean Sea (at the leeward side of the Alps), over the Adriatic and Ionian Seas, as well as over the Aegean and close to Cyprus (Fig. 5a). Surprisingly all RCMs, except WRF, underestimate the CCD of intense cyclones over the Mediterranean Sea. In contrast, the models tend to overestimate the frequency of intense cyclones over land, as for instance the ENEA-REGCM over the Balkans, and the CMCC-CCLM, WRF and PROMES RCMs close to the Alps. This suggests that the RCMs tend to produce more intense cyclones over land, and in particular close to the mountains. Therefore, fewer cyclones are found over the Mediterranean Sea in the dataset of the 500 most intense cyclones than in the reanalysis. Indeed, further analysis shows that if the 500 most intense cyclones in ERAI are compared to the 1500 most intense cyclones in RCMs (instead of the 500 most intense cyclones), then the RCMs similarity scores in Table 3 are higher by adding about 15 % to each model score. In order to assess statistically the RCMs capacity to properly produce the climatology of intense cyclones, Fig. 6 shows a Taylor diagram for the CCDs of the RCMs compared to ERAI. Results show a similar performance for all RCMs, achieving a high correlation score of about 0.8, similar standard deviations with ERAI and a RMSE of less than Table 3 Percentage of the 500 most intense cyclones, equally present in both ERAI and the MED-CORDEX models Scores correspond to the uncoupled (first row in each cell) and coupled models (second row in each cell) 33  40  63  44  24  29 %  36 %   34  39  64  44  24 --2 %. The exception is PROMES, which presents higher RMSE than the other RCMs, mainly due to the overestimation of CCD over the Alps and due to the underestimation of CCD in the marine areas near Italy.

EBU (%) CNRM-ALADIN52 (%) IPSL-WRF311 (%) COSMO-CLM (%) ENEA-REGCM3 (%) UCLM-PROMES CMCC-CCLM
To assess the models' capacity to reproduce the temporal variability of intense cyclones, we consider all intense cyclones forming in three different sub-regions: the western, central and eastern Mediterranean (boxes in Fig. 5a depict the three subregions). For each of the three subregions, we consider time series of the monthly counts of intense cyclones from 1989 to 2008. The results are shown in Fig. 7 in the form of Taylor diagrams, corresponding to the comparison of the cyclones inter-monthly variability between the RCMs and ERAI, in each of the three subregions, as well as in the whole Mediterranean domain. The small statistical spread of all models in Fig. 7 suggests  Fig. 3c). Indeed, all models perform similarly, with correlation scores exceeding 0.6 and standard deviations close to ERAI. The ENEA RCM systematically presents higher RMSE than the other RCMs, while the RMSE of WRF is the lowest. The contrasting performance of ENEA-REGCM and WRF is consistent with the lower and higher similarity scores of these RCMs in Table 3. The difference between the RMSE scores of the two RCMs is due to the good and poor skill in reproducing the cyclones seasonal cycle. In fact, the use of nudging in WRF benefits the RCM to reproduce realistically the ERAI intense cyclones (63 % similarity score; Table 3) and hence to better capture the cyclones' inter-monthly occurrence. On the other hand, the ENEA-REGCM presents about 25 % of similar cyclones to ERAI and thus it is plausible to suggest that the rest of the cyclones are not following a consistent seasonal cycle as in ERAI. Indeed, further analysis shows that applying methods 4, 5 and 6, ENEA-REGCM tends to overestimate (underestimate) summer (winter) cyclone occurrences (not shown).
All RCMs present similar statistical scores for their coupled and uncoupled versions, whether if we consider  (Fig. 4) or only the most intense ones (Figs. 6, 7). The small statistical differences between the cyclone climatologies in coupled and uncoupled RCMs suggest similar spatiotemporal variability of cyclone tracks. This is consistent with the results presented in Table 3, where fairly equal similarity scores are shown for both versions of the five coupled/uncoupled RCMs. In fact, when comparing intense cyclones between the coupled and uncoupled versions of the RCMs, the similarity scores increase dramatically. Table 4 shows that these scores might reach an average of more than 60 %, while the application of grid nudging in WRF results in the reproduction of 89 % of similar cyclones between its coupled and uncoupled version. The weak contribution of air-sea coupling to the spatiotemporal occurrence of intense Mediterranean cyclones is however a rather expected result. Concerning intense cyclones, upper-tropospheric forcing from troughs and cut-off systems plays the primary role for their intensification (Fita et al. 2006;Claud et al. 2010;Flaounas et al. 2015b). The role of SST is certainly crucial in altering the baroclinicity at the lower atmospheric levels and therefore for affecting cyclones dynamics. However, the air-sea coupling effect in RCMs is associated with rapid air-sea exchanges that have an impact on the local scale patterns of SST rather than its seasonal and/or large scale distribution. Consequently it is unlikely that coupled models significantly affect atmospheric baroclinicity and/or synoptic scale systems and therefore play a crucial role for Mediterranean cyclones. This is supported by the results of Sanna et al. (2013) who used high-resolution coupled and uncoupled simulations to study the cyclone climatology in the Mediterranean. They used tracking method 4 and found significant differences between the two simulations only for short-lived cyclones with lifetimes of less than a day. Such cyclones have been however excluded in this study and would certainly correspond to weak systems.
Finally, we assess the RCMs capacity to reproduce realistically the intensity of the selected most intense cyclones. Figure 8 presents boxplots of cyclone intensity for each of the three sub-regions, for all RCMs and ERAI. Due to the different intensity metrics among the cyclone tracking methods, we separate RV-based methods 1 and 2 and SLP-based methods 3-6. Regardless of the sub-region, all intense cyclones in ERAI have SLP minima between 990 and 1005 hPa. Cyclones in the eastern Mediterranean tend to be weaker with a median of about 1000 hPa, compared to 997 hPa for the western and central Mediterranean cyclones. In terms of RV, the ERAI cyclones in the eastern and the central Mediterranean tend to show fairly similar intensity distributions with median and extremes (95th percentile) reaching 17 and 22 × 10 −5 s −1 , respectively, while western Mediterranean cyclones tend to be weaker with median and extremes of the order of 16 and 20 × 10 −5 s −1 , respectively. For some RCMs, cyclone intensities in the RV-based methods are shown to be more intense in the Eastern Mediterranean, while they tend to be weaker if SLP-based methods are considered. A plausible explanation is the impact of orography to RCM resolved cyclone dynamics. More detailed orographic features may increase wind speeds and wind steering, resulting to higher RV at the cyclone centers, especially in the eastern Mediterranean domain, large part of which is covered by mountains. In all sub-regions, the ENEA, CMCC-CCLM and WRF RCMs distributions of SLP and RV are displaced to higher intensities with respect to ERAI, suggesting the production of more intense cyclones than in the reanalysis. On the other hand, the EBU, CNRM, COSMO-CLM RCMs present intensity distributions that are fairly close to ERAI in terms of median and extremes. It is noteworthy that PROMES presents more intense cyclones when considering their median of SLP, while in terms of RV its intensity distributions are closer to the ones of ERAI.
Again, the impact of air-sea coupling on cyclone intensity is rather insignificant for most models (Fig. 8). Significant differences between the structures of intensity distributions might be observed for specific models and sub-regions as for instance between the RV median of COSMO-CLM and ENEA-REGCM coupled and uncoupled versions (Fig. 8f). Thus, it seems that our results converge to the fact that the air-sea coupling effect is rather weak for the cyclones' spatiotemporal variability and intensities. In fact, the impact of SST to cyclone dynamics has been thoroughly investigated in past papers, where positive and negative SST anomalies have been applied (Homar et al. 2003;Katsafados et al. 2011;Miglietta et al. 2011;Tous et al. 2012). Results showed that when applying significant SST anomalies (e.g., +4 K) there were indeed considerable changes in cyclone dynamics and tracks. However, the coupled RCMs used here were shown to present an average SST bias of less than 1 K (Ruti et al. 2015) and therefore coupling could not be expected to have a strong impact on cyclones intensity. In addition, the weak impact of coupling air-sea interactions on cyclone intensity and tracks maybe due to the RCMs resolution. For instance, Akhtar et al. (2014) showed that the coupling effect in medicanes becomes significant for model grid spacings of the order of 10 km, while it is rather weak for coarser grids. Similar results have been reached by Gaertner et al. (this issue) who showed that improved horizontal resolution is important for RCMs to reproduce the most intense Mediterranean cyclones. It seems that the original resolution of the models (grid spacing of 20-50 km) is rather coarse for resolving realistically the feedback of air-sea interaction to cyclone dynamics.

Summary and discussion
In this study we analyzed the climatology of Mediterranean cyclones for the 20-year period of 1989-2008, as represented by ERAI and twelve RCM simulations. The simulations have been performed in the framework of the MED-CORDEX project by seven atmospheric RCMs. Five models were also coupled with an oceanic model. Our results are based on the application of six cyclone tracking methods, which emphasize different aspects of cyclone characteristics. Consequently, our study permits a wide assessment of the model's performance, covering multiple aspects of cyclone-related physics. This avoids false conclusions based upon a single, subjectively chosen analysis method. The analysis of the tracks investigated the cyclones' spatial distribution, seasonal cycle and intensity evolution during their lifetime. Our main results are the following: • Considering the RCMs capacity to realistically reproduce the climatology of Mediterranean cyclones, the interpretation of the results may vary according to the tracking method. With RV-based method 1, all RCMs have higher scores than with the RV-based method 2, which suggests that the RCMs tend to reproduce more accurately the stronger cyclones since the weaker ones are filtered out in method 1. • Comparing the performance of RCMs using different SLP-based methods, the results become less homogeneous (Fig. 4). Methods 3 and 6, which tend to identify only the most intense cyclones, show a rather robust performance of all RCMs (except of PROMES and ENEA-REGCM, which have a relatively high RMSE). For methods 4 and 5, the RCMs COSMO-CLM, CNRM and CMCC-CCLM perform best with the lowest RMSE, high correlation scores and standard deviation equal to ERAI. The realistic reproduction of cyclones near the Alps is the key feature that determines the good performance of these 3 RCMs. • Focusing on the most intense cyclones, all RCMs reproduce the spatial distribution of their tracks. However we show that the RCMs tend to produce more intense cyclones over land areas. It is noteworthy that RCMs have higher horizontal resolutions than ERAI and thus offer the potential of resolving finer scale atmospheric dynamics (e.g., related to interactions with orography). The feedback of fine scale atmospheric processes to meso-scale systems such as Mediterranean cyclones might have a non-detrimental effect on cyclone characteristics such as intensity and tracks. For instance, the RCMs tendency to produce more intense cyclones over land could be also attributed to the models' capacity to resolve more realistically, e.g., the impact of orography on cyclone dynamics. Consequently, our results concerning the reproduction of intense cyclones in the RCMs should not be regarded as a metric of the RCMs deficiency when compared to ERAI, but rather as the RCMs behavior when reproducing dynamics related to intense cyclones. • Nudging is beneficial for reproducing the ERAI cyclones. Indeed, the WRF model, which used grid nudging reproduced about two thirds of ERAI intense cyclones, compared to the other RCMs that reproduced about one third. Grid nudging was also beneficial for WRF to perform closest to ERAI in terms of reproducing the cyclones' seasonal cycle. This performance of WRF, differing from the other RCMs, is due to the relaxation of its fields towards ERAI (for wind, temperature and water vapor), which in turn also served as our reference dataset. Therefore, for reasons of consistency the WRF nudged model could not be taken into account for a fair RCM intercomparison. • Air-sea coupling was shown to have a rather weak impact on the cyclone climatology and their intensities. Concerning the cyclone climatology, the weak effect of air-sea interactions is explained by the fact that cyclo-genesis of intense cyclones is mainly due to upper tropospheric forcing. On the other hand, the weak effect of air-sea interactions on cyclone intensities could be due to the resolution of RCMs that is anyway coarser than what is typically needed to resolve such fine scale processes, as well as their feedback to the dynamics of the cyclonic systems.
Assessing the quality of RCMs reproduce prominent atmospheric flow systems such as cyclones is of paramount importance for understanding the model uncertainties in climate processes and extremes and this study shows the potential of tracking methods for a diagnostic description of RCM results. Though cyclone tracks are diagnostics on the spatiotemporal occurrence of cyclones and do not provide direct information on the complex multi-scale physical interactions that have a strong impact on cyclogenesis and cyclone intensification, they may be used to identify the factors responsible for the quality of RCMs results. The present work opens perspectives for the use of tracking methods to interpret RCMs outputs even beyond MED-CORDEX, not only for impact-related studies but also for Mediterranean cyclone dynamics.
Acknowledgments This work is part of the Med-CORDEX initiative (www.medcordex.eu). Med-CORDEX is a coordinated contribution to CORDEX that is supported by HyMeX (www.hymex. org) and MedCLIVAR (www.medclivar.eu) international programs. M.A. Gaertner and R. Romera were funded by the Spanish Government and the European Regional Development Fund, through Grants CGL2007-66440-C04-02, CGL2010-18013 and CGL2013-47261-R. S. Calmanti would like to acknowledge financial support from the EU funded SPECS project (Grant # 3038378). N. Akhtar would like to thank the Center for Scientific Computing (CSC) of the Goethe University Frankfurt am Main and the Deutsches Klimarechenzentrum (DKRZ) for providing computational facilities. M.Reale in the last stage of this work was supported by OGS and CINECA under HPC-TRES program award number 2015-07. Finally, H. Wernli would like to thank M. Sprenger for his help in adapting the cyclone tracking method 5 to data from regional models.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Appendix: Cyclone tracking methods
Method 1-This method is performed in two distinct steps: In step I, all cyclonic centers are identified and located at each 6-hourly model output, for the whole period 1989-2008. Cyclone centers are defined as local maxima of RV within enclosed contours of 3 × 10 −5 s −1 . The RV fields were first smoothed by applying a spatial moving average in an area of 8.25 deg 2 (i.e., 5 × 5 grid points). Tracking is then performed (step II) by connecting the identified cyclonic centers at consecutive time steps. This process is done by starting from the cyclonic center that presents the highest value of RV and then searches backward and forward in time for all possible tracks within a 10° × 5° area in longitude and latitude. This process creates several candidate tracks associated with the same mature cyclonic state, i.e., the same track point of highest RV. The algorithm finally chooses the track that presents the least average difference of RV between consecutive track points, weighted by the distance between the track points. This tracking method is presented in detail in Flaounas et al. (2014).
Method 2-In this method the cyclone centers are first defined as local maxima of the RV field at 850 hPa (ζ 850 ) within a radius of 380 km, determined every 6 h from the model output. Before determining the maxima, a 9 grid point spatial filter is applied on the ζ 850 field, by weighting by the inverse of the distance from the central point. In a second step, trajectories are built by pairing the cyclone centers previously determined for consecutive time steps. Three criteria are used to connect the cyclone centers: (1) the first criterion takes into account the absolute vorticity field and its variation from the given cyclone center. If the variation between two cyclone centers is larger than 40 %, the two centers are considered to belong to two different cyclonic systems. (2) The second criterion uses the mean flow in the middle troposphere represented by wind at 700 hPa. From winds at 850 and 700 hPa levels, two trajectory positions are determined and compared. (3) The third criterion ensures the trajectory coherence by minimizing the acceleration, in order to avoid the abrupt changes in wind speed and direction of the movement. More insight into this method is given in Ayrault and Joly (2000).
Method 3-This method functions in two steps too. First the cyclone centers are identified in each time step for the whole period, then they are connected by the tracking algorithm. The identification is based on finding minima in the SLP field. The SLP minima are searched successively in 15 × 15 grid point regions, which approximately corresponds to the typical size of a cyclone. A minimum is only kept, if it is less than 1013.5 hPa. The tracking algorithm uses the nearest neighbour approach. It specifies a rectangular area as a search region to find the continuation of the cyclone track in the consecutive time step. The search area is positioned around the cyclone center in a way that it is more extended on the eastern side. In this way the mainly eastward propagation of the cyclones is taken into account. If the algorithm finds more centers in the search area, the nearest one will be selected as the continuation of the track. At the end of the tracking procedure the cyclones lasting less than one day are discard. This tracking method is presented in detail in Kelemen et al. (2015).
Method 4-This method (Lionello et al. 2002) involves the partitioning of the SLP gridded fields in depressions by the identification of sets of steepest descent paths leading to the same SLP minimum. All the points crossed by a path leading to the same minimum are assigned to the same depression. Small depressions whose central minimum is close to the boundary of a different and deeper depression are included in the latter (a minimum distance has to be adjusted according to the grid spacing). This procedure attributes every point in the grid to a cyclone, such that the whole domain is divided in partitions, one for each cyclone center. The position of the SLP minimum is associated to the average position of all points in its partition with SLP not more than 3 hPa higher than the actual minimum. A vicinity criterion is used for obtaining the trajectory associated with each cyclone by joining the location of low pressure centers in successive maps.
Method 5-This method identifies and tracks cyclones using a slightly refined variant of the method described in Wernli and Schwierz (2006). For every local minimum of the SLP field, isobars are considered at 0.5 hPa intervals between 920 and 1050 hPa and then the regions within closed isobars are accepted as cyclone regions. The maximum length of the enclosing contour is limited to 7500 km, to exclude unrealistically large features that typically occur with very flat SLP distributions. Where SLP contours reach the border of the considered model domain, the contours are continued along the domain border. If two local SLP minima occur within a distance of less than 300 km, they are merged and the system is regarded as a single multicenter cyclone. The tracking then tries to meaningfully connect the previously identified cyclone centers in time, as described in Wernli and Schwierz (2006) and with the following modifications: (1) no orography filter is used here because of the complex geography of the Mediterranean region (i.e., all SLP minima are considered as potential cyclone centers for the tracking), and (2) if a cyclone track cannot be continued for one or two time steps (e.g., because the cyclone transforms into an open wave structure without clouds SLP contours) then the track is continued in case that a reasonable candidate can be found after these time steps.
Method 6-This method comprises several steps. First, all relative minima of SLP are identified. A pressure gradient threshold is applied, such that only those lows exceeding a pressure gradient of 0.5 hPa/100 km along at least six of eight main directions (N, NE, E, SE, S, SW, W, NW) are retained. A size limitation is applied if two minima are within approximately 170 km, such that only the minimum with the largest circulation is taken. In order to obtain cyclone tracks, the wind at 700 hPa is taken as the steering wind for cyclones. A cyclone center is considered to follow another one in the previous time step if it is within an ellipse around the latter center and whose major axis is proportional to the speed of the steering wind. If several centers satisfy this, the one with a track direction closest to the 700 hPa wind direction is selected as the next track point. Further details on the method are described in Picornell et al. (2001).