1 Introduction

Episodes of large-scale warm temperature anomalies in the ocean may prompt substantial disruptions to marine ecosystems (Frölicher and Laufkötter 2018; Hobday et al. 2016) and have major implications for fisheries (Mills et al. 2013). Known as marine heatwaves (MHW), these extreme events are abrupt but prolonged periods of high sea surface temperatures (SST) (Scannell et al. 2016) that can occur anywhere, at any time, with the potential to propagate deeper into the water column (Schaeffer and Roughan 2017). They received little attention until improved observational systems revealed their adverse consequences. Their occurrence is likely to intensify under continued anthropogenic warming (Frölicher et al. 2018; Oliver et al. 2018a), engendering the need for a more comprehensive examination of their spatiotemporal distribution and underlying physical causes.

In the Mediterranean area, a well-known “Hot Spot” region for climate change (Giorgi 2006), the annual mean basin SST by the end of the 21st century is expected to increase by + 1.5 °C to + 3 °C relative to present-day levels, depending on the greenhouse gas (GHG) emission scenario (Somot et al. 2006; Mariotti et al. 2015; Adloff et al. 2015). This significant rise in SST is expected to accelerate future MHW occurrence, in congruence with projections for GHG-induced heat stress intensification of 200–500% throughout the region (Diffenbaugh et al. 2007). The Mediterranean area’s sensitivity to increased GHG forcing is mainly attributed to a significant mean warming and increased interannual warm-season variability, along with a reduction in precipitation (Giorgi 2006). A recent study has already identified significant increases in MHWs globally over the last century, including in the Mediterranean Sea (Oliver et al. 2018a).

In fact, one of the first-detected MHWs worldwide occurred in the Mediterranean in the summer of 2003: surface anomalies of 2–3 °C above the climatological mean lasted for over a month due to significant increases in air temperature and a reduction of wind stress and air-sea exchanges (Grazzini and Viterbo 2003; Sparnocchia et al. 2006; Olita et al. 2007). These factors seem to have also triggered an anomalous SST warming in the eastern Mediterranean area during the heatwave of 2007, on the order of + 5 °C above climatology (Mavrakis and Tsiros 2018). Since then, numerous studies have explored the modulating factors behind individual events around the world. For instance, a combination of local oceanic and large-scale atmospheric forcing was suggested for the Australian MHW of 2011 (Feng et al. 2013; Benthuysen et al. 2014) and the persistent, multi-year (2014–2016) “Pacific Blob” (Bond et al. 2015; Di Lorenzo and Mantua 2016). Other events have been attributed to mainly atmosphere-related drivers, such as the 2012 Atlantic MHW (Chen et al. 2014, 2015) and the extreme marine warming across Tropical Australia (Benthuysen et al. 2018), or to ocean-dominated forcing like the 2015/2016 Tasman Sea MHW (Oliver et al. 2017). The importance of regional influences was further noted in coastal MHWs in South Africa (Schlegel et al. 2017a) and during subsurface MHW intensification around Australia (Schaeffer and Roughan 2017).

As a result of these events, severe impacts on marine ecosystems have been documented worldwide, including biodiversity die-offs and tropicalisation of marine communities (Wernberg et al. 2013, 2016), extensive species migrations (Mills et al. 2013), strandings of marine mammals and seabirds, toxic algal blooms (Cavole et al. 2016) and extensive coral bleaching (Hughes et al. 2017). In the Mediterranean Sea in particular, unprecedented mass mortality events and changes in community composition due to extreme warming were reported in the summers of 1999 (Perez et al. 2000; Cerrano et al. 2000; Garrabou et al. 2001; Linares et al. 2005), 2003 (Garrabou et al. 2009; Schiaparelli et al. 2007; Diaz-Almela et al. 2007; Munari 2011), 2006 (Kersting et al. 2013; Marba and Duarte 2010) and 2008 (Huete-Stauffer et al. 2011; Cebrian et al. 2011), affecting a wide variety of species and taxa (e.g. 80% of Gorgonian fan colonies and seagrass Posidonia oceanica). MHWs can be especially lethal for organisms with reduced mobility that are usually limited to the upper water column; their severity is determined by both temperature and duration (Galli et al. 2017). Finally, cascading effects have also been observed in fisheries, resulting in huge financial losses and even economic tensions between nations (Mills et al. 2013; Cavole et al. 2016; Oliver et al. 2017).

However, despite the growing body of MHW-related literature, systematic examination of MHWs as distinct exceptional events with intensity, frequency and duration has only just emerged. Although marine extremes have been investigated before, only a few studies have analysed past trends in extreme ocean temperatures (e.g. Scannell et al. 2016; MacKenzie and Schiedek 2007) and even fewer have dealt with their future evolution. For instance, past trends of extreme SST have been investigated in coastal regions (Lima and Wethey 2012) and through thermal-stress-related coral bleaching records (Lough 2000; Selig et al. 2010; Hughes et al. 2018). Using a more standardised framework, past MHW occurrences have been studied in the Tasman Sea (Oliver et al. 2018b) and the global ocean (Oliver et al. 2018a). For the 21st century, MHW projections have been performed so far on a global scale, with the use of multi-model setups from CMIP5 (Frölicher et al. 2018) and CMIP3 (Hobday and Pecl 2014) and under different GHG emission scenarios. On a regional scale though, ocean extremes have been assessed in Australia (King et al. 2017) and the Tasman Sea (Oliver et al. 2014).

The above-mentioned studies used different definitions for extreme warm temperatures, with some adopting a recent standardised MHW approach proposed by Hobday et al. (2016). The set of dedicated statistical metrics developed in this framework allows for a consistent definition and quantification of the MHW properties. A MHW is now described as a “discrete, prolonged, anomalously warm water event at a particular location”. Using this definition, Schlegel et al. (2017b), for example, identified an increase in MHW frequency around South Africa for the period 1982–2015, while Schaeffer and Roughan (2017) demonstrated subsurface intensification of MHWs in coastal SE Australia between 1953 and 2016. A linear classification scheme was also proposed by Hobday et al. (2018), where MHWs are defined based on temperature exceedance from local climatology.

In the case of the Mediterranean Sea, however, little is known about past or future MHW trends and their underlying mechanisms. The MHW-related research has mostly been focused on local ecological impacts without systematically assessing MHW occurrence. According to Rivetti et al. (2014) and Coma et al. (2009), most of the mass mortalities documented in the basin were related to positive thermal anomalies in the water column that occurred regionally during the summer. Although they have been reported with increased frequency since the early 1990s, their occurrence has been observed as early as the 1980s. Meanwhile, the evolution of extreme Mediterranean SST in the 21st century has so far been examined in relation to the thermotolerance responses of certain species. For instance, Jordà et al. (2012) used an ensemble of models under the moderately optimistic scenario for GHG emissions A1B and suggested an increased seagrass mortality in the future around the Balearic islands due to a projected rise of the annual maximum SST by 2100. Similarly, Bensoussan et al. (2013) evaluated the thermal-stress related risk of mass mortality in Mediterranean benthic ecosystems for the 21st century, based on the average warming estimated between 2090–2099 and 2000–2010, under the pessimistic future warming scenario A2. Finally, Galli et al. (2017) showed an increase in MHW frequency, severity and depth extension in the basin, assuming exceedances from species-specific thermotolerance thresholds under the high-emission IPCC RCP8.5 scenario. [The A1B and A2 emission scenarios correspond to projections of a likely mean temperature change of 1.7–4.4 °C and 2.0–5.4 °C respectively by the end of the 21st century (IPCC 2007), whereas RCP2.6, RCP4.5 and RCP8.5 correspond to a likely change of 0.3–1.7 °C, 1.1–2.6 °C and 2.6–4.8 °C respectively, by the end of the period (Kirtman et al. 2013).]

In addition, our understanding of the Mediterranean Sea’s response to future climate change to date mostly relies on ensembles of low-resolution GCMs (CMIP5) (e.g. Jordà et al. 2012; Mariotti et al. 2015) or on numerical experiments carried out with a single regional ocean model under different emission scenarios (i.e. Somot et al. 2006; Bensoussan et al. 2013; Adloff et al. 2015; Galli et al. 2017). Consequently, the various sources of uncertainty related to the choice of the socio-economic scenario, choice of climate model and natural variability have not been properly taken into account by climate change impact studies on Mediterranean Sea ecosystems and maritime activities. Since there is an evident link between distinctive climate anomalies and notable ecosystem effects (e.g. in the Mediterranean Sea, Crisci et al. 2011; Bensoussan et al. 2010), it is important to address these uncertainties by considering different possible climate futures through multi-model, multi-scenario set-ups when possible.

In this context, the aim of this study is to provide a robust assessment of the future evolution of summer MHWs in the Mediterranean Sea using an ensemble of high-resolution coupled regional climate system models (RCSM), driven by GCMs and a multi-scenario approach (RCP2.6, RCP4.5, RCP8.5). The RCSM’s ability to reproduce Mediterranean SST features is first evaluated against satellite data. Then, a MHW spatiotemporal definition, based on SST and on Hobday et al. (2016)’s recommendations, is developed and applied to study the response of extreme thermal events to future climate change. For the first time, changes in summer Mediterranean MHW frequency, duration, intensity and severity are investigated with respect to an envelope of possible futures.

This paper is organised as follows: in Sect. 2 we present the ensemble of RCSMs along with the methodology proposed for the detection and characterisation of summer MHWs. Model evaluation against observed mean and extreme SST is performed in Sect. 3, using daily SST data. We also describe the future evolution of Mediterranean SST and MHW properties under different greenhouse gas emission scenarios from 1976–2100. A discussion and summary of the results are presented in Sects. 4 and 5.

2 Material and methods

2.1 Model data and simulations

An ensemble of six coupled RCSMs (CNRM-RCSM4, LMDZ-MED, COSMOMED, ROM, EBU-POM, PROTHEUS) with different Mediterranean configurations is employed in this study. Participant members are provided by six research institutes from the Med-CORDEX initiative [Ruti et al. (2016), https://www.medcordex.eu/] and each simulation will be herein referred to by the name of the corresponding institute, as mentioned in Table 1 (e.g. simulations with the CNRM-RCSM4 model will be referred to as CNRM, etc.). Med-CORDEX can be considered as a multi-model follow-up to the CIRCE project (Gualdi et al. 2013), which studied the Mediterranean Sea under a single scenario (A1B) with most of the simulations stopped in 2050.

One novel aspect of the Med-CORDEX ensemble is that all models have a high-resolution oceanic (eddy-resolving) and atmospheric component as well as high coupling frequency (see Table 1). The free air-sea exchanges offered by their high-resolution interface are also an advantage for the MHW representation, which depends on ocean-atmosphere interactions. The domains cover the entire Mediterranean and a small part of the Atlantic, while the Black Sea and Nile river are respectively parametrized or represented with climatologies (except for AWI/GERICS, in which the oceanic component is global and explicitly simulates the Black Sea). Boundary conditions come from 4 different general circulation models of CMIP5. Information about each coupled system is summarised in Table 1. To avoid biasing results towards one or more members of the ensemble, only the realization with the highest resolution is selected for each model.

All the numerical simulations produced daily SST data (3D temperatures were stored at a monthly scale) between 1950–2005 for the historical experiment (HIST) and for 2006–2100 under the IPCC Representative Concentration Pathway scenarios RCP8.5 (high-emission scenario), RCP4.5 (moderate-emission scenario) and RCP2.6 (low-emission scenario). As the models use boundary conditions from CMIP5, which are not in phase with the observed variability, simulation chronology does not represent the actual conditions that correspond to each calendar year. Instead, the simulations are expected to represent the climate statistics of each period (e.g. average, standard deviation) well. We use SST instead of deeper layer temperatures, as both the models’ behaviour and the MHW identification technique can be evaluated at a larger scale using satellite data. A total of 17 simulations were used from six models with variable resolution (Table 1). For the purposes of our analysis, we define 30-year periods from the HIST run between 1976–2005 (hereafter referred to as HIST), the near future (2021–2050) and the far future (2071–2100).

In the case of ENEA, the HIST run spans 1979–2005 due to a different simulation initialization, while the CMCC and AWI/GERICS simulations reached 2099. The spin-up strategy of the Med-CORDEX ensemble was not prescribed and was therefore different for every configuration. The lack of a long spin-up (e.g. U.BELGRAD, ENEA) could be detrimental for temperatures at deeper layers but is less relevant for the SST evolution. For the CNRM model, a constant monthly flux (atmosphere to ocean) correction was applied to minimise identified biases, with no significant influence on the climate change signal. Also, a slightly intense SST signal in the Alboran Sea was noted in the U.BELGRADE configuration for 2021–2050 under RCP8.5 and is probably linked to the simple representation of the connection between the Mediterranean Sea and the Atlantic Ocean in the model: the open boundary condition, as defined in the POM model, was applied at a single model point defined at the Strait of Gibraltar, without any buffer zone and with prescribed boundary conditions in the Atlantic Ocean; this is, on the other hand, a common approach in many models. Finally, an error has been recently reported concerning the CNRM-CM5 GCM files that were used as atmospheric lateral boundary conditions for CNRM and ENEA (http://www.umr-cnrm.fr/cmip5/spip.php?article24), but this likely has no significant effect on the long-term climate change signal.

Working from the hypothesis that MHWs are usually confined close to the surface, in this study we consider that the model SST data of the 1st layer depth represent surface temperatures between 1–16 m, depending on the model. We acknowledge, however, that MHWs may penetrate deeper into the water column under certain conditions, but assume for the time being that SST is a reliable sign of possible harmful conditions for deeper layers.

Table 1 Characteristics of the Med-CORDEX coupled regional climate system models (RCSM) and the simulations used in this study

2.2 Reference dataset

In order to evaluate the models’ capability to simulate trends in regional extreme thermal events, we first perform comparisons with satellite data (OBS) provided by the Copernicus Marine Service and CNR - ISAC ROME. More specifically, the Mediterranean Sea high-resolution L4 dataset is employed, providing daily, reprocessed SSTs on a 0.04° grid, an interpolation of remotely sensed SSTs from the Advanced Very High Resolution Radiometer (AVHRR) Pathfinder Version 5.2 (PFV52) onto a regular grid (Pisano et al. 2016). They are obtained over a 30-year period of January 1982 to December 2012 and are used as a reference for the models’ performance in the mean and extreme climate in the Mediterranean Sea. With the aim of validating the “present-day” climate, we choose the 30-year period (1976–2005) in the model HIST runs that has the greatest overlap with the observed 30-year period (1982–2012). Prior to performing any calculations and in order to compare the results between the models and observations, we first interpolated every dataset to the NEMOMED8 grid, already in use by 2 RCSMs, by implementing the nearest neighbour method.
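The nearest neighbour remapping mentioned above can be illustrated with a minimal sketch. It is not the code used in the study: the function names, the brute-force search and the use of great-circle distance are assumptions made for illustration, and a real grid would call for a spatial index rather than an exhaustive search.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lon1, lat1, lon2, lat2, radius_km=6371.0):
    """Great-circle distance between two (lon, lat) points given in degrees."""
    dlon, dlat = radians(lon2 - lon1), radians(lat2 - lat1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * radius_km * asin(sqrt(a))


def nearest_neighbour_regrid(src_points, src_values, dst_points):
    """Assign to each target point the value of the closest source point.

    src_points and dst_points are lists of (lon, lat) tuples (hypothetical layout).
    """
    regridded = []
    for lon, lat in dst_points:
        nearest = min(
            range(len(src_points)),
            key=lambda i: haversine_km(lon, lat, *src_points[i]),
        )
        regridded.append(src_values[nearest])
    return regridded
```

Because no averaging is involved, nearest neighbour remapping preserves the original SST values, including extremes, at the cost of a blockier field.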

2.3 Defining marine heatwaves

As for their atmospheric counterparts, there is no universal definition for MHWs. However, certain metrics can be applied to compare different events in space and time. In this research, the qualitative MHW definition proposed by Hobday et al. (2016) is followed. We use it as a baseline for developing a quantitative method that will identify MHWs, namely in the summer months, based on the climatology and the geographical characteristics of the area. Although we recognise that heatwaves in colder months might also be essential for certain species, we choose to focus on extreme events related to the highest annual SSTs, when organisms may be beyond their optima, as seen by previous mass mortality events in the Mediterranean (e.g. MHW of 1999 and 2003).

According to Hobday et al. (2016) a MHW is a “prolonged, anomalously warm water event at a particular location” and it should be defined relative to a 30-year period. In our case, a subset of the HIST experiments (1976–2005) and the 1982–2012 period for the observations are chosen, representing the average climate in the latter half of the 20th century. In order to achieve a homogenised yet area-specific temperature diagnostic, for every year of the reference period (HIST) we first compute the 99th quantile of daily SST (\(SST_{99Q}\)) for every grid point. Then we average these 30 years of extreme values, constructing a 2D threshold map. Note that individual threshold maps were created for each dataset separately, accounting for the different model characteristics (e.g. SST bias). An “anomalously warm day” at every grid point is then any given day when the local \(SST_{99Q}\) threshold is exceeded. However, in order for an event to be classified as “prolonged”, we set the minimum duration of a MHW to 5 days, following Hobday et al. (2016). Further, we aim to identify long-lasting events, since most of the previous mass mortalities in the basin occurred during thermal anomalies that lasted for more than 5 days (e.g. Garrabou et al. 2009; Di Camillo and Cerrano 2015; Cerrano et al. 2000; Cebrian et al. 2011). In addition, the average present-day MHW duration in the basin was found to be around 10 days (not shown). Therefore, a 3-day or 7-day minimum definition threshold would not significantly change the MHW characteristics in the future (see Sect. 4).
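As an illustration of the threshold construction described above, the following sketch computes \(SST_{99Q}\) for a single grid point and extracts runs of at least 5 consecutive days above it. It is a simplified reading of the text rather than the authors' code: the function names, the data layout (a dictionary mapping each year to its daily SST values) and the linear-interpolation quantile estimator are assumptions.

```python
from statistics import mean


def quantile(values, q):
    """Linear-interpolation quantile (0 <= q <= 1) of a non-empty sequence."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])


def sst99q_threshold(daily_sst_by_year):
    """Average the yearly 99th quantiles of daily SST over the reference period."""
    return mean(quantile(year_values, 0.99) for year_values in daily_sst_by_year.values())


def warm_spells(daily_sst, threshold, min_days=5):
    """Return (start, end) day indices of runs of at least min_days days above threshold."""
    spells, start = [], None
    # A sentinel below any threshold closes a run that reaches the end of the series.
    for day, sst in enumerate(list(daily_sst) + [float("-inf")]):
        if sst > threshold and start is None:
            start = day
        elif sst <= threshold and start is not None:
            if day - start >= min_days:
                spells.append((start, day - 1))
            start = None
    return spells
```

Applying `sst99q_threshold` at every grid point of a dataset would yield the 2D threshold map; doing so separately for each model and for the observations keeps dataset-specific biases out of the exceedance test.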

The discrete nature of MHWs also necessitates a well-defined starting and ending day, but gaps with temperatures close to threshold values can also be found, as a result of day-to-day SST fluctuations. At this point, our definition differs slightly from that of Hobday et al. (2016). More specifically, gaps of up to 4 consecutive days are allowed inside a local MHW (considered as warm days). This is true, however, only when both the preceding and following 6-day mean SST of a gap day (including the gap day in each mean) are above the local \(\hbox {SST}_{\mathrm{99Q}}\). For the cool day “neighbourhood” this would represent a tendency to remain above threshold, even though the SST of that particular cool day might be below the limit. This also reflects the fact that minor SST deviations from the threshold cannot impact the overall warm conditions of a MHW. Nor would such a deviation be likely to offer an “essential” relief to organisms, even to the less mobile and perhaps less tolerant species, once a MHW has started. Taking advantage of the statistical sensitivity of the mean to outliers (in this case cold temperatures), we make the assumption that an event with the potential to interrupt a MHW (e.g. wind, current) should cause a considerable drop in daily SST. Therefore, a below-threshold drop in either of the 6-day SST averages would not allow any cool day to merge with a MHW, in the same way that a sequence of five cool days or more would interrupt an event entirely. The 11-day window around the gap day is chosen since the minimum duration of a MHW was set to five days.
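The gap rule can be sketched for a single grid point as follows. This is one possible reading of the description, not the authors' implementation: the function name and arguments are hypothetical, and it is assumed that a gap is bridged only if every cool day in it passes the two 6-day mean tests and that gap days too close to the ends of the series (where the 11-day window is incomplete) are never bridged.

```python
def fill_gaps(daily_sst, threshold, max_gap=4, window=6):
    """Flag warm days, treating short cool gaps as warm when both 6-day means exceed threshold."""
    n = len(daily_sst)
    warm = [sst > threshold for sst in daily_sst]
    filled = list(warm)

    def window_means_ok(gap_day):
        # Preceding and following means both include the gap day itself.
        first, last = gap_day - window + 1, gap_day + window - 1
        if first < 0 or last >= n:
            return False  # assumption: incomplete windows never qualify
        before = sum(daily_sst[first:gap_day + 1]) / window
        after = sum(daily_sst[gap_day:last + 1]) / window
        return before > threshold and after > threshold

    day = 0
    while day < n:
        if warm[day]:
            day += 1
            continue
        end = day
        while end < n and not warm[end]:  # run of cool days is [day, end)
            end += 1
        enclosed = day > 0 and end < n    # a gap must lie between warm days
        if enclosed and end - day <= max_gap and all(
            window_means_ok(d) for d in range(day, end)
        ):
            for d in range(day, end):
                filled[d] = True
        day = end
    return filled
```

The output of `fill_gaps` can then be scanned for runs of at least five flagged days to obtain local MHWs with their starting and ending days.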

The spatial coverage of the MHW is then determined by aggregating grid points that are “activated” in a MHW state every day but are not necessarily contiguous. In common with many atmospheric definitions, a minimum of 20% of the Mediterranean surface in \(\hbox {km}^2\) was chosen in order to detect large-scale events that may have a broad ecosystem impact but are also rare under the average climate conditions of the HIST period. We therefore opt for prolonged, large-scale and extremely warm ocean temperatures that do not occur on a yearly basis in the 20th century, with a view to quantifying their evolution in the 21st century under different GHG emission scenarios. The advantage of a percentile-based SST threshold in our case is that spatial patterns are also identified independently from the different extreme temperature levels that characterise sub-basins in the Mediterranean. We acknowledge that the detection method is developed based on subjective choices, and the sensitivity of the climate change results to these choices was also tested (see Sect. 4).
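The 20% surface criterion amounts to an area-weighted count of activated grid cells for each day. A minimal sketch follows, assuming flattened lists of per-cell MHW flags and cell areas in \(\hbox {km}^2\); both the layout and the function names are hypothetical.

```python
def mhw_coverage(active, cell_area_km2):
    """Fraction of the total sea surface occupied by cells currently in a MHW state."""
    total_area = sum(cell_area_km2)
    active_area = sum(area for flag, area in zip(active, cell_area_km2) if flag)
    return active_area / total_area


def basin_event_days(active_by_day, cell_area_km2, min_fraction=0.20):
    """Flag the days on which activated cells cover at least min_fraction of the basin."""
    return [mhw_coverage(active, cell_area_km2) >= min_fraction for active in active_by_day]
```

Weighting by cell area rather than counting cells matters when the grid spacing is not uniform in \(\hbox {km}^2\), and activated cells need not be contiguous to contribute.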

Once a MHW is identified, a subset of the MHW metrics defined in Hobday et al. (2016) is used to characterise it. We examine the frequency of MHWs (annual count of events), and the duration of each event is defined as the time between the first (\(t_{s}\)) and last day (\(t_{e}\)) for which a minimum of 20% of the Mediterranean Sea surface is touched by a MHW. Every event is characterised by a mean and max intensity (mean and spatiotemporal maximum temperature anomaly relative to the threshold over the event duration) and a maximum surface coverage. Finally, its severity is represented by cumulative intensity (spatiotemporal sum of daily temperature anomalies relative to the threshold over the event duration) (Fig. 1, Table 2).
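To make the event metrics concrete, the sketch below computes them for one basin-scale event. Several details are assumptions rather than statements from the text: only activated cells contribute, Imean is an unweighted mean over activated cell-days, severity weights each daily anomaly by cell area (consistent with units of °C days \(\hbox {km}^2\)), and duration counts both the first and the last day. The function name and data layout are hypothetical.

```python
def event_metrics(anomaly_by_day, active_by_day, cell_area_km2):
    """Summarise one event from daily per-cell anomalies (SST minus local threshold).

    anomaly_by_day and active_by_day hold one list per event day, aligned with
    cell_area_km2 (hypothetical flattened layout).
    """
    total_area = sum(cell_area_km2)
    active_anomalies = []
    severity = 0.0
    max_coverage = 0.0
    for anomalies, flags in zip(anomaly_by_day, active_by_day):
        day_area = 0.0
        for anomaly, flag, area in zip(anomalies, flags, cell_area_km2):
            if flag:
                active_anomalies.append(anomaly)
                severity += anomaly * area  # one day per time step: °C days km^2
                day_area += area
        max_coverage = max(max_coverage, day_area / total_area)
    return {
        "duration_days": len(anomaly_by_day),
        "imean": sum(active_anomalies) / len(active_anomalies),
        "imax": max(active_anomalies),
        "severity": severity,
        "max_coverage": max_coverage,
    }
```

Counting the events returned per calendar year would then give the annual frequency.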

Fig. 1 Schematic of a MHW based on Hobday et al. (2016). The black line represents daily SST variations of one grid point in a random year and the red line is the local threshold (\(\hbox {SST}_{\mathrm{99Q}}\)) based on the 30-year average of the yearly 99th quantile of daily SST for that point. The blue line is the daily 30-year climatology for this point. Also shown are the starting day (\(\hbox {t}_{\mathrm{s}}\)) and ending day (\(\hbox {t}_{\mathrm{e}}\)) above \(\hbox {SST}_{\mathrm{99Q}}\), gap days and the different measures of daily intensity. MHW metrics refer to the total event duration

Table 2 Marine heatwave (MHW) set of properties and their description after Hobday et al. (2016)

3 Results

3.1 Model evaluation

The first goal of this paper is to evaluate each model’s ability to simulate mean (\({\overline{SST}} \)) and extreme Mediterranean Sea SST (\(\hbox {SST}_{\mathrm{99Q}}\)) correctly. For this reason, pattern correlations were first performed, using the Pearson product-moment coefficient of linear correlation between two variables. The observed annual mean \({\overline{SST}} \) between 1982–2012 (Fig. 2, OBS) shows a NW-SE pattern of cold-warm temperatures ranging from \(\sim 15\) °C to 23 °C, respectively. Similarly, all the models demonstrate a warmer Eastern Mediterranean (EM) between 19 and 23 °C while colder deep water formation areas (e.g. Gulf of Lions, Adriatic) are captured well around 15–17 °C. Despite a multi-model mean (MMM) cold bias of about 0.6 °C, spatial correlations between each model mean 1976–2005 \({\overline{SST}} \) and observations are high (MMM \(\sim 0.94\)). The lowest bias is found in ENEA and the highest in the CMCC and AWI/GERICS models (see Table 3). Note that the satellite product provides skin and night-time SST values, whereas the model SST represents averaged daily temperatures of the first few meters of mixed layer depth. Part of the model bias can therefore be explained by this difference in SST.
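For reference, the pattern correlation and mean bias between a model map and the observed map can be computed as in the sketch below. It assumes both fields are already on a common grid and flattened to lists, with land points marked as `None`; the function names and the absence of area weighting are assumptions, since the text only specifies the Pearson coefficient.

```python
from math import sqrt


def _sea_pairs(model_map, obs_map):
    """Keep only grid points where both fields are defined (None marks land)."""
    return [(m, o) for m, o in zip(model_map, obs_map) if m is not None and o is not None]


def pattern_correlation(model_map, obs_map):
    """Pearson product-moment correlation between two flattened maps."""
    pairs = _sea_pairs(model_map, obs_map)
    n = len(pairs)
    mean_m = sum(m for m, _ in pairs) / n
    mean_o = sum(o for _, o in pairs) / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in pairs)
    var_m = sum((m - mean_m) ** 2 for m, _ in pairs)
    var_o = sum((o - mean_o) ** 2 for _, o in pairs)
    return cov / sqrt(var_m * var_o)


def mean_bias(model_map, obs_map):
    """Mean model-minus-observation difference over the common sea points."""
    pairs = _sea_pairs(model_map, obs_map)
    return sum(m - o for m, o in pairs) / len(pairs)
```

A uniformly colder model field gives a correlation of 1 together with a negative bias, which is why the two diagnostics are reported separately.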

Fig. 2 Yearly \({\overline{SST}} \) (°C) for the HIST run of every model (1976–2005) and satellite data during 1982–2012. Note that the HIST run for ENEA is from 1979–2005

More complex spatial patterns are revealed when examining the 2D threshold maps used as the basis for defining Mediterranean MHWs (Fig. 3). The highest \(\hbox {SST}_{\mathrm{99Q}}\) are observed in Central Ionian, Gulf of Gabes, Tyrrhenian Sea and Levantine basin varying from approximately 27–31 °C and the lowest (20–22 °C) in deep water formation areas and the Alboran Sea (Fig. 3 OBS). In general, all the models are able to reproduce these patterns, although this time they share lower spatial correlations with the observations (MMM \(\sim 0.78\)). The ENEA model shows a warm bias whereas CMCC, U.BELGRAD and AWI/GERICS show a cold bias larger than 1 °C. The similar behaviour of the latter three could perhaps be related to the common atmospheric component (ECHAM) of their driving GCM. On the whole, the difference between the MMM mean and the extreme basinwide SST is found to be \(\sim 6.6\) °C, in good agreement with the observations (7.1 °C). The corresponding MMM spread is small for the \({\overline{SST}}\) but higher for the \(\hbox {SST}_{\mathrm{99Q}}\) (\(\sim 1\) °C).

Fig. 3 Individual MHW threshold maps of mean \(\hbox {SST}_{\mathrm{99Q}}\) (°C) computed from the HIST run of every model (1976–2005) and satellite data during 1982–2012. Note that the HIST run for ENEA is from 1979–2005

Further, the domain-averaged timeseries of SST illustrate a warming tendency of both the \({\overline{SST}} \) and annual \(\hbox {SST}_{\mathrm{99Q}}\) (Fig. 4; Table 3). Even though MMM \({\overline{SST}} \) and \(\hbox {SST}_{\mathrm{99Q}}\) show similar trends (\(\sim 0.02\) °C/year), they seem to underestimate the corresponding observed trends (0.04/0.05 °C/year). Particularly for AWI/GERICS and U.BELGRAD, this is a response likely explained by their common driving GCM (MPI-ESM-LR). On the other hand, the amplitude of interannual variability is found to be similar to the observations for most of the models. On the whole, the observed and most of the model trends are statistically significant at a level of 95% except for certain cases indicated in Table 3. Interestingly, none of the simulations peaked as high as the observations during the exceptional MHW year of 2003 (20.4 °C for \({\overline{SST}} \) and 28.4 °C for \(\hbox {SST}_{\mathrm{99Q}}\)). This record basinwide \(\hbox {SST}_{\mathrm{99Q}}\) value is 8.7 °C higher than the average \({\overline{SST}} \) of 1982–2012 and 2.8 °C greater than the basin-mean \(\hbox {SST}_{\mathrm{99Q}}\) of that period.
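A trend in °C/year of the kind quoted above can be estimated from a yearly, domain-averaged series with a least-squares fit. The text does not state which fitting or significance-testing method was used, so ordinary least squares is an assumption here, and the function name is hypothetical.

```python
def linear_trend(years, values):
    """Ordinary least-squares slope of values against years (units of values per year)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return sxy / sxx
```

For example, a synthetic series rising steadily by 0.04 °C each year between 1982 and 2012 returns a slope of about 0.04 °C/year.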

Fig. 4 Timeseries of area-averaged, yearly \({\overline{SST}} \) (°C, left) and \(\hbox {SST}_{\mathrm{99Q}}\) (°C, right), during HIST for every model and satellite data, represented by a solid line. Trends are indicated in dashed lines. The different simulations are represented by different colors

In terms of MHW properties during 1982–2012 (Table 3), the observed MHW frequency is 0.8 events per year; events last a maximum of 1.5 months and occur between July and September. The mean intensity of MHWs varies from 0.3 to 0.9 °C, covering a maximum of 20–90% of the Mediterranean Sea surface, with a maximum intensity of 5.0 °C (2002) and a maximum severity of \(8.5 \times 10^{7}\) °C days \(\hbox {km}^2\). The highest values over this period (except for Imax) refer to the characteristics of the well-known MHW of 2003. More specifically, they correspond to a Mediterranean-scale event lasting 48 days (20 July–5 September) by our definition, in line with Grazzini and Viterbo (2003) and Sparnocchia et al. (2006). It seems that mainly the phase of the MHW that was both large-scale and intense was captured here.

On average, the simulated events during HIST are well within the equivalent observed range of every variable. They do, however, show a slightly lower annual frequency, a potential for slightly higher maximum durations and starting dates up to early September. They also appear to underestimate the upper level of the Imean, Imax and severity range. In particular, event durations of two months or more are exhibited by the LMD, CMCC and AWI/GERICS models, while the ENEA model shows the highest Imax of 5.3 °C. Maximum severity, on the other hand, appears closer to the observed values only in the LMD and CMCC models. These configurations also show a MHW maximum spatial coverage above 80%, along with CNRM and AWI/GERICS. In general, the Med-CORDEX ensemble appears to perform well given that this is the first time, to our knowledge, that Mediterranean RCSMs have been evaluated for MHW properties.

To better understand the ensemble variability of the MHW characteristics in the HIST period, we also combine Intensity-Duration-Frequency (IDF) information for every dataset separately (see Fig. 5). All events of this period are organised in bins of Imean (every 0.02 °C) and duration (5-day bins progressively increased to 10-day and 20-day bins). Although some models simulate longer events relative to the observed MHW of 2003, only CMCC exhibited equivalent MHWs in terms of duration and intensity. At the same time, 1–3 events are detected for most classes of Imean and duration, in both the observations and the models. There are only a few cases where 3–7 events appear with Imean below 0.6 °C, but they follow no specific duration pattern.
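The IDF binning can be sketched as a two-dimensional count of events. The Imean bin width follows the text, but the exact duration bin edges are not given there, so the edges below (5-day bins up to 30 days, 10-day bins up to 60 days, 20-day bins beyond) are purely illustrative assumptions, as are the names.

```python
from bisect import bisect_right
from collections import Counter
from math import floor

# Assumed duration bin edges in days; the last bin is open-ended.
DURATION_EDGES = [5, 10, 15, 20, 25, 30, 40, 50, 60, 80, 100]


def idf_counts(events, imean_width=0.02, duration_edges=DURATION_EDGES):
    """Count events per (Imean bin, duration bin); events are (imean, duration_days) pairs."""
    counts = Counter()
    for imean, duration in events:
        # Small offset guards against floating-point error at exact bin edges.
        imean_bin = floor(imean / imean_width + 1e-9)
        duration_bin = bisect_right(duration_edges, duration) - 1
        counts[(imean_bin, duration_bin)] += 1
    return counts
```

Each key identifies one cell of the IDF plot and the associated count is the frequency shown for that cell.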

Fig. 5 IDF plot; Intensity (Imean in °C), Duration (Days), Frequency (Number of MHW during 1976–2005). Imean is organised in bins of 0.02 °C while duration is in bins of 5, 10 and 20 days. Red box indicates observed characteristics corresponding to the exceptional MHW of 2003

Table 3 Evaluation of SST and MHW properties during HIST run

3.2 Future Mediterranean SST evolution

In this section we analyze projections of \({\overline{SST}} \) and \(\hbox {SST}_{\mathrm{99Q}}\) in the 21st century by comparing their evolution against the reference period and under different GHG emission scenarios.

During 2021–2050 an increase is found for the domain-averaged ensemble mean \({\overline{SST}} \) and \(\hbox {SST}_{\mathrm{99Q}}\) with respect to HIST, around 0.8–1 °C and 1–1.2 °C respectively. While the mid-21st century anomalies appear almost independent from the greenhouse gas forcing, a more diverse and substantial warming occurs towards 2071–2100 (see Fig. 6). In particular, the multi-model mean \({\overline{SST}} \) and \(\hbox {SST}_{\mathrm{99Q}}\) anomalies under RCP8.5 are 3.1 °C and 3.6 °C respectively, exhibiting nearly a doubling of their corresponding RCP4.5 rise. Similarly, the equivalent increase of CNRM \({\overline{SST}} \) and \(\hbox {SST}_{\mathrm{99Q}}\) under RCP8.5 is about 3 times as high as that under RCP2.6 for the same period. Individually, however, the highest mean and extreme SST anomalies are demonstrated by the LMD and CMCC models under every scenario and for every period (see Table 4). For both SST indices, the effects of the different emission scenarios become more evident by 2060, with the highest/intermediate warming occurring for every model under RCP8.5/RCP4.5 and the lowest under the (mono-model) RCP2.6 simulation. In the latter, little or no difference is found between the \({\overline{SST}} \) and \(\hbox {SST}_{\mathrm{99Q}}\) rise throughout the century. In contrast, under RCP4.5 and RCP8.5, the multi-model \(\hbox {SST}_{\mathrm{99Q}}\) increase appears greater than the \({\overline{SST}} \) rise by 20–25% during 2021–2050 and by 16–18% for 2071–2100 (see Table 4 and discussion section). This implies a higher contribution from \({\overline{SST}} \) to the warming towards the end of the century.
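The percentages quoted above are consistent with expressing the extra \(\hbox {SST}_{\mathrm{99Q}}\) warming as a fraction of the \({\overline{SST}} \) warming. The short check below uses the rounded multi-model values from this section; the formula and the pairing of the lower and upper ends of the 2021–2050 ranges are assumptions about how the percentages were derived.

```python
def relative_excess(extreme_anomaly, mean_anomaly):
    """Percentage by which the extreme SST rise exceeds the mean SST rise."""
    return 100.0 * (extreme_anomaly - mean_anomaly) / mean_anomaly


# Far future, RCP8.5: 3.6 °C (extreme) against 3.1 °C (mean) gives about 16%.
far_future_rcp85 = relative_excess(3.6, 3.1)
# Near future: 1.0 against 0.8 °C and 1.2 against 1.0 °C give about 25% and 20%.
near_future_low = relative_excess(1.0, 0.8)
near_future_high = relative_excess(1.2, 1.0)
```

Because the absolute gap between the two anomalies grows more slowly than the mean warming itself, the relative excess shrinks towards the end of the century.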

The spatial distribution of the corresponding anomalies, however, appears inhomogeneous. For 2071–2100, some regions in the Levantine basin, Balearic Islands, Tyrrhenian Sea, Ionian Sea and North Adriatic Sea exhibit the highest MMM \({\overline{SST}} \) anomalies in every scenario (Fig. 7). In contrast, the lowest anomalies of that period are located in the Alboran Sea, where cold waters are advected from the Atlantic; depending on the scenario, they range from \(\sim 0.6\) °C (RCP2.6) to \(\sim 2.4\) °C (RCP8.5).

Fig. 6

Area-average, yearly \({\overline{SST}} \) (°C, left) and extreme \(SST_{99Q}\) (°C, right) anomalies with respect to HIST. Bold colors represent the multi-model average and lighter colors are the individual simulations. The RCP2.6 scenario has only one simulation (CNRM); the HIST run is shown in grey and observations in dashed black

Meanwhile, the most pronounced extreme warm anomalies (\(\hbox {SST}_{\mathrm{99Q}}\)) for 2071–2100 under RCP4.5 and RCP8.5 are projected for the NW Mediterranean, Tyrrhenian Sea, Ionian Sea and some parts of the North Levantine basin (Fig. 8). Under RCP2.6, though, the greatest \(\hbox {SST}_{\mathrm{99Q}}\) anomalies (> 1.2 °C) are more confined to the Aegean Sea, Adriatic Sea, Tyrrhenian Sea and the area around the Balearic Islands. In addition to the highest \(\hbox {SST}_{\mathrm{99Q}}\) rise, the Adriatic Sea, Ionian Sea, Tyrrhenian Sea, some parts around the Balearic Islands and the North Levantine basin also display the greatest \({\overline{SST}} \) rise for every scenario during the second half of the 21st century. During 2021–2050, however, they exhibit the highest mean and extreme warming under RCP2.6 and RCP4.5 but not under RCP8.5. The Alboran Sea and the SE Levantine basin, on the other hand, demonstrate the lowest \(\hbox {SST}_{\mathrm{99Q}}\) anomalies in every period and every scenario.

Table 4 Future Mediterranean-averaged, yearly mean (\({\overline{SST}} \)) and extreme (\(SST_{99Q}\)) anomalies (with respect to HIST) for the near and far future under different emission scenarios

3.3 Future evolution of Mediterranean MHWs

The MHW climate change response is examined here using anomalies. These anomalies are computed for the average MHW characteristics in the future relative to the average MHW characteristics in HIST run, for each sub-period, model and scenario (Table 5).

The multi-model mean reveals an increase in frequency of 0.3–0.4 events/year for every period of RCP4.5/RCP8.5, with the mono-model RCP2.6 simulation showing a slightly greater increase of 0.5–0.7 events/year. In fact, the individual simulations in Fig. 9 suggest a shift from a period where years without MHWs were common (1976–2030) to a period with at least one long-lasting MHW every year. More specifically, towards 2071–2100, events can start as early as June and finish as late as October under RCP8.5, whereas for RCP4.5 and RCP2.6 the MHW temporal extent lies between July and September (Fig. 10). It is clear that the higher the radiative forcing, the broader the window of occurrence. For example, MHWs during 2071–2100 may last on average 3 months longer than HIST MHWs (\(\sim 21.8\) days, not shown) in RCP8.5, and almost 2 months longer in RCP4.5 (see Table 5). This MMM increase in duration is almost double the corresponding increase during 2021–2050 under RCP4.5 (\(\sim 30.9\) days) and more than double that under RCP8.5 (\(\sim 39.2\) days). Even under the optimistic RCP2.6 scenario, MHWs by 2050 may be 17.2 days longer than today and may become at most 1 month longer by 2100.

Long-term projections show analogous changes in the Imean of future MHWs. They are examined through IDF plots that display the total number of MHWs identified by the ensemble during the HIST run (1976–2005), the near future and the far future (Fig. 11). To avoid imbalances in the present-future comparisons arising from the different sets of models for RCP4.5 and RCP8.5 (see Table 5), all the simulated future events are pooled for every period and juxtaposed against the corresponding sets of HIST events. Therefore, we show 3 HIST IDF plots, one for each scenario. As previously demonstrated, the stronger the emission scenario, the longer the duration and the higher the Imean of the events. The MMM Imean response appears small during 2021–2050 (+ 0.1 °C to + 0.3 °C depending on the scenario) but increases towards the end of the period with higher radiative forcing. For instance, MHWs show durations of up to 170 days (Fig. 11) in the far future of RCP8.5 and an Imean of 1.8 °C on average (not shown). For the CNRM model under RCP2.6, though, the corresponding response towards 2071–2100 is double that of the mid-21st century, while it becomes 4.5 times higher under RCP8.5 (see Table 5). Longer-lasting MHWs at the end of the period for RCP4.5 and RCP8.5 explain the lower frequency of occurrences compared to RCP2.6. The MMM average response of Imax behaves similarly to that of Imean, with the highest anomalies found towards 2071–2100 (up to 3.7 °C), whereas during the mid-21st century they range between 0.5 °C (RCP2.6) and 1.2 °C (RCP8.5).

It should also be noted that for RCP4.5 and RCP8.5, events with characteristics similar to the observed exceptional MHW of 2003 (Fig. 11, red box) seem to become the new standard over 2021–2050 and even constitute weak occurrences for the distant future of RCP8.5. In the more optimistic RCP2.6, MHWs appear more frequent during 2021–2050 but their number decreases slightly towards 2071–2100. Their characteristics, however, sustain a lower increase throughout the period compared to RCP4.5 and RCP8.5. For example, the response in duration and Imean is close to that projected for CNRM during 2021–2050 under RCP4.5 and RCP8.5 (see Table 5). Therefore, the possibility of an event like the MHW of 2003 occurring regularly still features in a scenario close to the Paris Agreement (RCP2.6).

Yet, the range of the uncertainty in future projections evolves not only in time but also across the different models. The severity (Icum) distribution of future MHWs was examined in that sense using box-and-whisker diagrams. In these box plots, a specific Icum index is assigned to each simulated event of every dataset for each period and scenario (see Fig. 12, left). By definition, Icum translates the total spatiotemporal MHW impact into numbers. It features an exponential increase from HIST towards the end of the century, from \(\sim 1 \times 10^7\,{^{\circ }}\hbox {C~days~km}^2\) to \(\sim 50 \times 10 ^7\,{^{\circ }}\hbox {C~days~km}^2\) for RCP8.5 (2071–2100, not shown). Moreover, the higher the emission forcing, the faster the ensemble mean Icum response escalates from its mid- to end-of-century values; for example, Icum varies from 5 to \(33 \times 10^{7}\,{^{\circ }}\hbox {C~days~km}^2\) in RCP4.5 and from 6.8 to \(49.1 \times 10^7\,{^{\circ }}\hbox {C~days~km}^2\) in RCP8.5 (see Table 5). This becomes more evident when comparing the equivalent CNRM severity response under RCP2.6 (2.4–3.6 \(\times 10^7\,{^{\circ }}\hbox {C~days~km}^2\)) with the significantly higher response under RCP4.5 and RCP8.5. Although all configurations indicate an abrupt escalation through time, there appears to be a family of models (CMCC and LMD) that share a stronger climate change response. Those models exhibit higher changes in Icum, along with higher Imean, Imax, and duration values than the remaining models (see also Table 5 and Sect. 4).
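To make the units of the severity index concrete, the following is a minimal sketch of how a cumulative intensity in °C days km² could be accumulated for one event. It assumes that the daily intensity of a grid cell is its SST excess above the local threshold and that sub-threshold cells contribute nothing; both are illustrative assumptions rather than the exact definition used in this study, and the function and argument names are hypothetical.

```python
def cumulative_intensity(daily_fields, thresholds, cell_areas_km2):
    """Sum of (SST - threshold) * cell area over all days and affected cells.

    daily_fields   : list of days, each a list of SST values (degC), one per cell
    thresholds     : local threshold (degC) for each cell (assumed fixed in time)
    cell_areas_km2 : surface area (km^2) of each cell
    Returns a value in degC * days * km^2, using a one-day time step.
    """
    total = 0.0
    for field in daily_fields:
        for sst, threshold, area in zip(field, thresholds, cell_areas_km2):
            excess = sst - threshold
            if excess > 0:  # only cells above their threshold contribute
                total += excess * area
    return total


# Synthetic two-cell, three-day example (all values are made up).
thresholds = [27.0, 26.0]
areas = [100.0, 50.0]
days = [[27.5, 25.0], [28.0, 26.5], [26.0, 27.0]]
print(cumulative_intensity(days, thresholds, areas))
```

Because the index multiplies intensity, duration and area, an event that is simultaneously warmer, longer and more widespread grows in severity much faster than any single characteristic does.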

The identified families of MHWs are also associated with a maximum spatial coverage, illustrated through box plots in Fig. 12. It is estimated that events may affect a maximum of 40% of the Mediterranean Sea, on average, during HIST but may impact almost 100% of the basin by 2071–2100 under RCP8.5. Notwithstanding the large variability found for the mid-21st century, by 2100 the simulated maximum MHW extent seems to be a unanimous projection from every model under RCP8.5. Conversely, MHWs under RCP2.6 increase their maximum coverage throughout the period, but towards 2071–2100 events may impact, on average, a maximum of 70% of the Mediterranean Sea.

Table 5 Future response (anomalies with respect to HIST) of MHW mean properties for the 6 RCSMs under RCP8.5, RCP4.5 and RCP2.6, for the near (2021–2050) and far future (2071–2100)

4 Discussion

4.1 MHW detection method

Several sensitivity tests performed on the MHW detection algorithm using only the CNRM model indicate low levels of uncertainty associated with small perturbations on the initial definition. For example, definitions with a different number of gap days, different minimum duration or minimum MHW spatial extent allowed (e.g. 10%) were tested but did not seem to change significantly the response of future MHW characteristics with respect to HIST run (see Supplementary Table 1S). The use of different quantile thresholds also showed that climate change response of duration, Imean and Imax with respect to HIST does not differ significantly if a lower/higher threshold than the \(\hbox {SST}_{\mathrm{99Q}}\) is chosen. However, the severity and maximum spatial coverage appear more sensitive to such changes (see Supplementary Table 1S).

However, certain limitations exist: assuming no spatial connectivity, the detection algorithm identifies large-scale (>20%) and long-lasting events but does not consider MHW effects during colder months or spatially smaller events. While it describes surface MHWs in the summer, it can also be applied to deeper layers and/or the winter season when data availability allows.
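The role of the tunable parameters mentioned above (gap days, minimum duration, minimum spatial extent) can be illustrated with a simplified, basin-level sketch. It takes as input the daily fraction of the basin whose SST exceeds the local threshold and returns the detected events. The 20% extent follows the text, whereas the default `min_duration` and `max_gap` values, the choice to merge gaps before applying the duration filter, and the function name are placeholders assumed for illustration, not the settings of this study.

```python
def detect_events(area_fraction, min_extent=0.20, min_duration=5, max_gap=1):
    """Return (start, end) day indices (inclusive) of basin-scale warm events.

    area_fraction : daily fraction (0-1) of the basin above the local threshold
    min_extent    : minimum fraction of the basin that must be affected
    min_duration  : minimum event length in days (placeholder value)
    max_gap       : maximum number of sub-threshold days bridged (placeholder)
    """
    # 1. Collect runs of consecutive days that meet the spatial-extent criterion.
    runs, start = [], None
    for day, fraction in enumerate(area_fraction):
        if fraction > min_extent and start is None:
            start = day
        elif fraction <= min_extent and start is not None:
            runs.append((start, day - 1))
            start = None
    if start is not None:
        runs.append((start, len(area_fraction) - 1))

    # 2. Merge runs separated by at most max_gap sub-threshold days.
    merged = []
    for run in runs:
        if merged and run[0] - merged[-1][1] - 1 <= max_gap:
            merged[-1] = (merged[-1][0], run[1])
        else:
            merged.append(run)

    # 3. Keep only events lasting at least min_duration days.
    return [(s, e) for s, e in merged if e - s + 1 >= min_duration]


# Synthetic daily coverage series (made-up values).
coverage = [0.1, 0.3, 0.4, 0.5, 0.1, 0.35, 0.3, 0.25, 0.05, 0.05, 0.6, 0.1]
print(detect_events(coverage))             # gap of one day is bridged
print(detect_events(coverage, max_gap=0))  # no bridging: runs are too short
```

Re-running such a routine with different parameter values and comparing the resulting event statistics is the kind of sensitivity test described in this section.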

4.2 Model-observation discrepancies

Model discrepancies in mean and extreme Mediterranean Sea temperatures with respect to observations were also evaluated using a shorter but common reference period (1982–2005) for both datasets. Values of \({\overline{SST}} \), \(SST_{99Q}\), their trends and pattern correlations did not change considerably. However, the multi-model mean bias was reduced by 28% for \({\overline{SST}} \) and by 31% for \(SST_{99Q}\) (see Supplementary material Table 2). Moreover, MHW identification appeared consistent despite the different SST layer depth of the observations (\(\sim \) mm) and the models (\(\sim \) m).

4.3 Model uncertainty

By default, the estimate of the uncertainty is given by the variation of the results across the ensemble members in an opportunistic way (Knutti et al. 2010). Although the models we use have a high-resolution representation of the air-sea interactions, uncertainties are introduced by their individual biases and by the small number of currently available Med-CORDEX simulations (6 RCSMs). To address this, more runs will be added in the future as part of the Med-CORDEX initiative. Despite this limitation, the RCSM ensemble seems to explore well the spread of SST anomalies predicted by earlier studies based on GCMs. For example, for RCP4.5 we estimate annual area-average \({\overline{SST}} \) anomalies for 2006–2100 with respect to HIST of approximately 0.7 to 2.6 °C, depending on the model (Fig. 6, left). This covers a large part of the corresponding anomalies found by Mariotti et al. (2015) for 2006–2100 with respect to the 1980–2005 mean, which were between 0.5 °C and 3.5 °C for the CMIP5 ensemble of GCMs under the RCP4.5 scenario. Although our ensemble appears to underestimate the upper limit of this CMIP5 range, this could also reflect a better representation of the Mediterranean Sea dynamics by the regional models. Indeed, at higher resolutions the representation of air-sea interaction also improves (e.g. Akhtar et al. 2018; Roberts et al. 2016; Hewitt et al. 2017). At the same time, our results indicate an intensification of MHWs in the Mediterranean Sea with time, in agreement with the results obtained by Oliver et al. (2018a) and Frölicher et al. (2018), who used different MHW definitions.

Although some models have demonstrated lower/higher biases than others, we have chosen not to discard any of the configurations, since their weak performance in some indices is not related to any specific behaviour of MHW indices in the scenario runs. This choice also favours a holistic presentation of the uncertainty spectrum, without a considerable impact on the climate change response. More specifically, closer examination of the \({\overline{SST}} \) and \(SST_{99Q}\) bias effect on the anomalies of the average MHW characteristics in RCP8.5 and RCP4.5 with respect to HIST suggested no particular tendency or outliers affecting the range of the outcome, for any of the periods and scenarios (Supplementary material Fig. 1S, Fig. 2S). It is, however, notable that LMD and CMCC tend to show stronger responses in MHW or SST values. This could be due to the driving GCMs (IPSL-CM5A-MR and CMCC-CM), which demonstrate a higher mean surface temperature change over Europe by 2080 compared to CNRM-CM5 and MPI-ESM-LR, according to McSweeney et al. (2015). In that study, the performance of all the GCMs driving the Med-CORDEX RCSMs was characterised as “satisfactory” for downscaling, except that of LMD, which was found to have biases.

4.4 MHW evolution and changes in SST

Present-day extreme warming at the order of \(SST_{99Q}\) might constitute a rare occurrence for the Mediterranean Sea climate; however, in the future it becomes the new normal. In 2071–2100 in particular, the warming signal is found so high that almost every day from June to October can experience such extreme temperatures. This means that future warming in the Mediterranean Sea is practically able to saturate what is considered today as a severe MHW. The difference between the scenarios lies in the fact that under RCP4.5 and RCP8.5 anomalous temperatures appear more persistent and widespread and therefore fewer but longer and more intense events occur. By contrast, under RCP2.6, events appear less persistent, and therefore more “breaks” between MHWs may occur (frequency of events is increased), since a significant part of the basin is more likely to fall below the \(SST_{99Q}\) threshold (Fig. 9).

Most of the future changes in MHW characteristics were seen to increase following the GHG forcing, yet this raises the question of whether this behavior could be explained by changes in the mean (shift of distribution) or the day-to-day SST variability (distribution flattening/narrowing). As a first indicator, we calculated (for the CNRM model only) the \({\overline{SST}}\) difference between RCP8.5 (2071-2100) and HIST and added it to the current \(SST_{99Q}\) threshold map (see Supplementary Table 1). The resulting climate change response (future-present) of MHW characteristics was much lower than the one found when using the initial \(SST_{99Q}\) threshold alone. This signifies that the mean SST change alone can explain a large part, but not all, of the future changes in MHWs. We estimate that 10–20% of the MHW characteristics are due to changes in day-to-day variability.
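The logic of this threshold-shift test can be illustrated on synthetic data. In the sketch below, a present-day threshold is raised by the mean SST change and exceedance days are counted in two artificial futures: one that is a pure shift of the historical series and one that is shifted and widened. All numbers, including the widening factor of 1.2 and the stand-in threshold, are made up for illustration and are not taken from the CNRM simulations.

```python
def exceedance_days(series, threshold):
    """Number of days on which the series is strictly above the threshold."""
    return sum(1 for value in series if value > threshold)


# Synthetic "historical" daily SST sample (degC) and a stand-in extreme threshold.
hist = [24.0, 24.5, 25.0, 25.5, 26.0, 26.5, 27.0, 27.5]
threshold = 27.2
mean_shift = 3.0  # assumed mean SST change between future and historical periods

hist_mean = sum(hist) / len(hist)
# Future A: the whole distribution is shifted, its shape is unchanged.
future_shift = [v + mean_shift for v in hist]
# Future B: shifted and widened (day-to-day variability increased by 20%).
future_wide = [hist_mean + mean_shift + 1.2 * (v - hist_mean) for v in hist]

shifted_threshold = threshold + mean_shift
print(exceedance_days(hist, threshold))                 # present-day reference
print(exceedance_days(future_shift, shifted_threshold))  # same as reference
print(exceedance_days(future_wide, shifted_threshold))   # residual excess
```

If the future change were a pure shift, raising the threshold by the mean change would recover the present-day exceedance count; any residual increase points to a change in the shape of the distribution, which is the reasoning applied to the MHW characteristics here.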

To further test our hypothesis, we calculated the multi-model mean ratio \(\hbox {R}=\varDelta ({SST_{99Q}}_{\mathrm{Scenario}}-{SST_{99Q}}_{\mathrm{Hist}})/ \varDelta ({\overline{SST}}_{\mathrm{Scenario}}-{\overline{SST}}_{\mathrm{Hist}})\) for every scenario and period (see Fig. 13). In regions where \(\hbox {R}>1\), SST daily variability contributes to the extreme temperature increase (distribution flattening), and only where \(\hbox {R}>2\) does it dominate the mean SST change contribution. For \(\hbox {R}<1\), a narrowing of the daily SST distribution dampens the mean SST signal, and the mean SST change makes the dominant contribution where R is close to 1. Overall, model results indicate a higher contribution from SST daily variability change in the mid-21st century compared to 2071–2100, when the \({\overline{SST}}\) change becomes more important (Table 4; Fig. 13). During 2021–2050 and for every scenario, basin-mean R\(\sim 1.2\), with the Alboran Sea, some coastal parts of the Aegean Sea, the Adriatic Sea and the SE Levantine basin exhibiting a narrowing (\(\hbox {R}<1\)) or a shift (R = 1) in the SST distribution. Towards 2071–2100, however, and under RCP4.5 and RCP8.5, the basin-average R lies in the range \(1<\hbox {R}<1.2\) and more areas demonstrate values closer to R = 1.
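As a small numerical illustration, the ratio R of the extreme SST change to the mean SST change can be computed and interpreted as follows. The absolute temperatures in the example are hypothetical and chosen only so that the differences match the rounded RCP8.5 end-of-century anomalies quoted in this paper (3.6 °C and 3.1 °C); the tolerance used to decide that R is "close to 1" and the function names are also assumptions.

```python
def ratio_r(sst99q_scen, sst99q_hist, mean_scen, mean_hist):
    """R = (change in extreme SST) / (change in mean SST) for a point or region."""
    delta_mean = mean_scen - mean_hist
    if delta_mean == 0:
        raise ValueError("mean SST change is zero, so R is undefined")
    return (sst99q_scen - sst99q_hist) / delta_mean


def classify(r, tol=0.05):
    """Interpret R following the text; `tol` is an illustrative tolerance."""
    if abs(r - 1.0) <= tol:
        return "shift"  # mean change is the main factor
    return "flattening" if r > 1.0 else "narrowing"


# Hypothetical absolute values (degC); only the differences follow the text.
r = ratio_r(sst99q_scen=30.6, sst99q_hist=27.0, mean_scen=23.1, mean_hist=20.0)
print(round(r, 2), classify(r))
```

Applied grid point by grid point to the multi-model mean fields, the same calculation yields a map of the kind shown in Fig. 13.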

It is worth noting that narrowing of the SST distribution is a rare situation that appears in small areas only under RCP2.6 or around the Alboran Sea under RCP8.5 for the mid-21st century. Flattening, on the other hand, appears more common and could possibly reflect the increase in day-to-day variability of 2 m air temperature estimated by Giorgi (2006) for the Mediterranean area in the future. Finally, a slightly stronger influence of SST daily variability is seen in the NW Mediterranean area for every scenario and period (Fig. 13). A possible explanation for such a spatial pattern could be a future shoaling of the mixed layer depth (MLD), as projected by Adloff et al. (2015). This would mean that heat fluxes would be able to change the heat content of a shallower mixed layer faster, thereby increasing daily SST variability.

5 Conclusions

The main objective of this study is to investigate the future evolution (1976–2100) of SST and marine heatwaves in the Mediterranean Sea, using the best dedicated multi-model ensemble available. Here we examine six Regional Climate System Models from the Med-CORDEX initiative, driven by 4 CMIP5 GCMs under the RCP2.6, RCP4.5 and RCP8.5 scenarios. A quantitative MHW definition and detection method based on SST and on Hobday et al. (2016) approach is developed, targeting large-scale and long-lasting events, mostly in the warmer months. The algorithm uses a climatological 99th percentile threshold based on historical simulations (1976–2005) and takes into account a spatially-varying threshold. It delivers MHW metrics such as frequency, duration, mean and maximum intensity along with severity and maximum spatial coverage.

Spatiotemporal indices under a 1976–2005 (HIST) run reveal that the Med-CORDEX ensemble simulates the present MHW characteristics well, although it appears to underestimate the warming trends of \({\overline{SST}} \) and \(SST_{99Q}\) of that period with respect to observations from 1982–2012. The latter dataset yields an annual frequency of 0.8 events/year, with MHWs lasting a maximum of 1.5 months between July and September, while covering a maximum of 90% of the Mediterranean Sea surface. The longest and most severe event of that period corresponded to the MHW of 2003, which also demonstrates the highest mean intensity and maximum event coverage.

Analysis of future evolution shows that differences in the GHG forcing are reflected mostly towards 2071–2100, whereas uncertainty for the mid-21st century is dominated by the model uncertainty. Ensemble means by the end of the century demonstrate the highest \({\overline{SST}} \) (3.1 °C) and \(SST_{99Q}\) (3.6 °C) increase under RCP8.5 and the lowest under RCP2.6 (mono-model). The corresponding warming for 2021–2050, however, is less pronounced under RCP4.5 (\(\sim 0.8\) °C/1 °C) and RCP8.5 (\(\sim 1\) °C/1.2 °C). In contrast, the basin-wide mean and extreme SST rise for RCP2.6 (\(\sim 1\) °C) does not differ significantly from the mid- to the end of the 21st century.

By 2100, models project at least one long-lasting MHW occurring every year under RCP8.5, up to 3 months longer, about 4 times more intense and 42 times more severe than today’s events. Their occurrence is expected between June and October, affecting, at peak, the entire Mediterranean basin. In fact, with respect to the HIST run, MMM MHW frequency increases by a factor of \(\sim 1.6\) for RCP8.5 and RCP4.5 by 2021–2050 and by slightly less than that towards 2071–2100 for both scenarios. The equivalent CNRM comparison between the scenarios reveals a slightly greater frequency increase during 2071–2100 under RCP2.6 (by a factor of 1.7) than under RCP8.5 and RCP4.5. Multi-model mean duration, on the other hand, is multiplied by a factor of 3.7 for RCP4.5 and 5.3 for RCP8.5 during 2071–2100. MHWs under RCP8.5 may also have an Imean 3.9 times as high as that of today’s events, while the equivalent increase under RCP4.5 and RCP2.6 is significantly lower (see Table 5). For 2021–2050, however, there is a higher convergence in the factor of increase in frequency (\(\sim \) 1.5x), duration (\(\sim \) 2.4x–2.7x), Imean (\(\sim \) 1.5x) and severity (\(\sim \) 5x–7x) between the MMM of RCP4.5 and RCP8.5.

In general, MHWs become stronger and more intense in response to increasing greenhouse gas forcing, especially towards the end of the century. RCP2.6 also shows a slight increase in MHW signatures with time, but one lower than under RCP4.5 and RCP8.5. Note here that certain models demonstrate stronger climate change responses than others, likely due to the choice of the driving GCM rather than to the individual RCSM biases. Much of the MHW evolution is found to occur mainly due to an increase in the mean SST, but an increase in daily SST variability also plays a noticeable role. Complementary sensitivity tests also indicate that a mean shift in SST distributions alone cannot be responsible for the future changes in MHWs.

Overall, the MHW and SST changes predicted for the 21st century will clearly impact the vulnerable Mediterranean Sea ecosystems. The widespread consequences encountered during the MHW of 2003 could become the “new normal”, since our analysis indicates that future MHWs become longer and more intense than this event in the near future. Especially under RCP8.5 and in 2071–2100, MHWs can become three times longer than the MHW of 2003, with mean intensities three times higher. While RCP8.5 is the business-as-usual scenario, RCP2.6 is the closest to the Paris Agreement limits, which could offer relative stability in both the SST increase and MHW evolution in the basin after the mid-21st century. MHWs exert a strong influence not only on marine ecosystems but also on marine-dependent economies and hence society. Therefore, more research is needed towards an improved mechanistic understanding of these events and their underlying physical drivers. In a constantly warming world, this information, along with projections of large-scale future MHW evolution, might help identify regions with a physical predisposition to these extreme occurrences. In combination with biogeochemical studies, more light could be shed on the full extent of the biological system risks related to MHWs.

Fig. 7

Multi-model average anomaly of yearly \({\overline{SST}} \) (°C) with respect to the corresponding ensemble mean HIST of each scenario, for the near and far future. The RCP2.6 scenario has only one simulation (CNRM)

Fig. 8

Multi-model average anomaly of extreme \(SST_{99Q}\) (°C) with respect to corresponding ensemble mean HIST (1976-2005) of each scenario, for the near and far future. The RCP2.6 scenario has only one simulation (CNRM)

Fig. 9

Annual number of MHWs (Annual Frequency) for RCP8.5 (red), RCP4.5 (blue), RCP2.6 (green), HIST (grey) and observations (dashed black). Bold colors indicate the multi-model mean and shaded zones represent individual MHW events identified by the models. Years without MHWs are also included, with shaded areas reaching 0. RCP2.6 has only 1 simulation (CNRM)

Fig. 10

Annual earliest starting (solid lines) and latest ending (dashed lines) day of MHW events for RCP8.5 (red), RCP4.5 (blue), RCP2.6 (green), HIST (grey) and observations (black). Bold colors indicate multi-model average values while lighter dots represent individual event dates

Fig. 11

IDF (Imean, Duration, Frequency) plots display the total MHW number of every dataset, for every scenario, over 2021–2050 and 2071–2100. RCP8.5 and RCP4.5 include events from 5 simulations, while RCP2.6 includes events from only 1 (CNRM) simulation. The HIST run contains MHWs from the corresponding set of models each time. The number of MHWs is calculated over each 30-year period. For comparison, the red box depicts the observed characteristics of the MHW of 2003 in the Mediterranean

Fig. 12

Whisker diagram of (left) Severity (Icum) and (right) maximum surface coverage of every observed and simulated MHW during HIST, 2021-2050 and 2071-2100. Box plots illustrate minimum, 25th percentile, median, 75th percentile and maximum values of each variable for a given model, scenario and period

Fig. 13

Multi-model mean ratio R of \(\varDelta \)\(SST_{99Q}\) (°C) over \(\varDelta \)\({\overline{SST}} \) (°C) for every period and scenario. Regions where \(\hbox {R}>1/\hbox {R}<1\) indicate regions where flattening/narrowing of SST distribution is detected in addition to the mean distribution shift. Where R \(\sim 1\) the \({\overline{SST}}\) increase can be considered as the main factor for MHW changes