1 Introduction

Marine heatwaves (MHWs) cause significant stress on marine ecosystems with considerable consequences for nature conservation, and societies that depend upon them economically, culturally, and socially. Recent warming events have shown that such extreme events are key drivers of biodiversity loss in the affected areas, as for example in the Caribbean (Boveid et al. 2022) and at the Great Barrier Reef in Australia (Wernberg et al. 2013). Though MHWs by definition are rare, their impact is substantial. Recent MHW events have resulted in mass coral deaths and bleaching (e.g. Normile et al. 2016; Hughes et al. 2017; Le Nohaïc et al. 2017) and biological regime shifts (Wernberg et al. 2016). Past decades of MHWs have shown an increasing trend in both frequency and duration (Oliver et al. 2018, 2019) and projections for the future are dreary, suggesting a transition to permanent heatwaves in many areas (Frölicher et al. 2018; Darmaraki et al. 2019; Hayashida et al. 2020). Higher frequency and more intense MHWs could also pose immediate threat to the integrity of coastal carbon stocks (Canadell et al. 2021). In other words there is a pressing need to further understand how MHWs impact ecosystems in a changing climate. To do so, we need reliable climate projections on a relevant and representative scale. Here we investigate to what extent a coupled atmosphere-ocean regional climate model can add value to simulations of MHWs, which are still mainly projected using coarse resolution global climate models.

Global climate models have shown that many occurrences of MHWs are associated with increased insolation and reduced ocean heat losses due to high atmospheric pressure and anomalously weak winds (Sen Gupta et al. 2020). However, current global models tend to have a spatial resolution too coarse to resolve the key processes driving MHWs. Higher spatial resolution in regional ocean models can potentially add value with their improved ability to resolve mesoscale dynamics. This has a potential to improve representation of MHWs. Indeed, a recent study conducted by Pilo et al. (2019) using higher resolution global models showed a weaker bias as resolution increased, however biases persisted in the best simulation, still simulating less frequent, longer lasting and weaker MHWs compared to observations. Guo et al. (2018) also finds that a higher resolution simulation results in a better representation of MHWs, but biases in frequency and intensity persist. This suggests that increasing model resolution is only part of the solution and model representation of key processes driving MHWs need to be improved.

A finer model resolution may improve the coupling between the upper ocean and lower atmosphere. This has a potential to improve model representation of MHWs since the upper ocean layer heat budget can be a major driver of MHWs (Holbrook et al. 2019; Vogt et al. 2022). For a more focused regional study, regional ocean models offer a cost-effective approach to increased model resolution in selected regions by downscaling the coarser resolution global models, compared to increasing resolution in a global model. However, regional ocean models cannot represent feedbacks between the atmosphere and ocean, and their use of coarse global atmospheric data at the surface means regional ocean models also lack important local-scale information at the air-sea interface. These limitations can be overcome with fully coupled regional atmosphere-ocean modelling systems such as the recently developed COAWST modelling system (Warner et al. 2010). COAWST provides an opportunity to address this important issue at the atmosphere-ocean interface. This coupled modelling system has been successfully used by numerous studies to investigate air-sea interactions during tropical cyclone genesis and evolution (e.g. Warner et al. 2010; Olabarrieta et al. 2012; Drews and Galarneau 2015; Mooney et al. 2016; Chen et al. 2018).

In this study we use COAWST to identify the added value of a coupled regional climate model in producing more reliable projections of MHWs. In particular we aim to understand the differences in regional and global model simulations and the local mechanisms behind such differences. Though large scale climate mode variations, such as El Niño, are also known to be an important factor (e.g. Holbrook et al. 2019; Sen Gupta et al. 2020) our study will focus entirely on local drivers. We start with a description of the model systems and metrics used in Sect. 2 before we present the experimental results in Sect. 3 and finally Sect. 4 elaborates on the results and concludes the study.

2 Model description, setup and MHW metrics

We used the Coupled Ocean–Atmosphere-Wave-Sediment Transport Modeling System version 3.6 (COAWST) (Warner et al. 2010) consisting of the ocean model ROMS version svn1013 and the atmospheric model WRF version 4.1.5. The coupling was handled by the Model Coupling Toolkit which exchanged data fields between the two models every hour. The coupling includes ocean surface stresses and surface net heat fluxes from the atmosphere to the ocean, and sea surface temperature (SST) from the ocean to the atmosphere (Warner et al. 2010). To avoid any potential regridding distortions during data transfer between the models we collocated the grids for the atmosphere and the ocean models.

2.1 Physical ocean model

The ocean component is the Regional Ocean Modeling System (ROMS) ( Shchepetkin and McWilliams 2005, Haid-vogal et al. 2008). It is a three-dimensional model, using a vertically stretched terrain-following coordinate, here configured with enhanced resolution in the upper ocean, and a horizontal curvilinear Arakawa C grid. ROMS enables many options for advection and mixing, here we have used the 3.-order upstream horizontal advection scheme and the 4.-order centered vertical advection scheme. For mixing we used a harmonic horizontal mixing scheme for momentum and the Mellor-Yamada 2.5 mixing scheme for the vertical mixing. We configured the ocean domain with 20 vertical levels and a 12 km horizontal grid spacing, collocated with the atmospheric domain. As our area of interest was in the Atlantic region we masked out the ocean grid points in the Pacific region to save computational resources. The Fennel model (Fennel et al. 2006, 2008, 2011), a bio-geochemistry model, were coupled to the ocean, but will not be explored further in the present study.

2.2 Atmospheric model

The atmospheric model component is the nonhydrostatic, quasi-compressible Weather Research and Forecasting model (WRF). It has terrain following vertical sigma coordinates and a horizontal curvilinear Arakawa C grid (Skamarock et al. 2019). The input file allows the user to configure the model with a large variety of parameterization schemes; we chose the Thompson microphysics scheme (Thompson et al. 2008), the new Kain-Fritsch parameterization scheme for the convection representation (Kain 2004), the Rapid Radiative Transfer Model for general circulation models (RRTMG) schemes for both long and short wave radiation (Iacono et al. 2008), the Yonsei (YSU) planetary boundary layer parameterization scheme (Hong et al. 2006) and the Noah multi-physics land surface model (Niu et al. 2011) to represent the surface and immediate subsurface physics over land. The atmospheric model was configured with 40 vertical levels, and a horizontal domain with 12 km grid spacing, collocated with the ocean model grid.

2.3 Initial and boundary conditions

Two downscaling experiments were initialized on the 1st of January 1991. We ran the first four years as spin up, resulting in an analysis period from 1st of January 1995 to and including 31st of December 2014. The reanalysis driven COAWST\(_{ERA5}\) (Pontoppidan et al. 2023a) ocean module was initialized with daily data from HYCOM (http://tds.hycom.org/thredds/catalog.html) and the lateral boundary conditions were derived from the 5-day averaged SODA3 (Carton et al. 2018). The reasoning behind the use of different ocean initialization and driving dataset was stability issues in the COAWST model when initializing with SODA. The atmospheric module was initialized and driven by the ERA5 reanalysis on a 3-hourly timestep (Hersbach et al. 2020). COAWST\(_{NorESM2-MM}\) (sometimes further abbreviated to COAWST\(_{NE2-MM}\)) (Pontoppidan et al. 2023b) was entirely driven by the NorESM2-MM model simulation described below. The regional domain simulated is shown in Fig. 1, Table 1 provides an overview of all analysed simulation resolutions, including regional and global models.

Lateral boundary forcing in COAWST ocean components were nudged towards the monthly forcing throughout the integration in a wider boundary zone. We also nudged the temperature, salinity, and momentum along the coastline of the American mainland to mitigate an encountered problem of excessive upwelling and erroneous cooling in coastal areas. Radiation conditions were used for the three-dimensional velocities and tracers. Tides were not included in the model as they are generally low and irregular in our region of interest, increase model complexity, and add considerable computational expense. Also the river input was excluded to reduce the complexity of the model; we expect this to have only minor impact on the results since the coastline is nudged towards the driving data. Nudging were not applied in the atmospheric interiors, only in the boundary zone using the default settings of WRF.

2.4 Global models

To address the added value of the regional model we analyzed two coupled versions of the global Norwegian Earth system model version 2 (NorESM2): the first has a spatial resolution of 1\(^{\circ }\) in both the atmosphere and the ocean (NorESM2-MM), and the second has 1\(^{\circ }\) and 0.25\(^{\circ }\) resolution in the atmosphere and ocean, respectively (NorESM2-MH). In addition, we also analyzed an ocean-only reanalysis driven NorESM version 1 simulation (NorESM-OC), which is a global ocean-only 1\(^{\circ }\) resolution simulation (Schwinger et al. 2016). All global models provided daily SST values, and monthly full ocean depth temperatures and salinity. The NorESM2-MM version is part of the CMIP6 (Coupled Model Intercomparison Project phase 6) ensemble and contributed to the latest Intergovernmental Panel for Climate Change Assessment Report (IPCC-AR6). A thorough analysis of the NorESM2-MM model performance in simulating the present day large-scale mean climatology and variability can be found in Seland et al. (2020), Tjiputra et al. (2020). A Caribbean focused analysis of selected atmospheric variables is available in the supplementary material.

2.5 Observational and reanalysis data

The observational dataset for SST is the CoralTemp SST v3.1 (CRW) product from the CoralReefWatch suite (crw 2014; Skirving et al. 2020). The dataset is a 5 km gridded nighttime ocean surface temperature product based on multiple satellite SST analyses and an ocean reanalysis. When full ocean depth data is needed we used SODA3.7.2 data which is a reanalysis build on the ocean component of the Modular Ocean Model version 5. It has a 0.25\(^{\circ }\) horizontal grid and 50 vertical levels with enhanced resolution in the upper 100 ms. The reanalysis assimilates both in situ and satellite SST observations in addition to profile hydrography data (Carton et al. 2018). The ERA5 reanalysis (Hersbach et al. 2020) is used as ground truth for the atmospheric variables. ERA5 uses grid spacings of 31 km and produces hourly output which has been used here to calculate daily values for this study.

Table 1 Overview of the spatial resolutions and the analyzed simulation period of the analyzed observational SST (CRW) and the models (COAWST, NorESM-OC, and NorESM2s) used in the study

2.6 Marine heatwave metrics

To assess MHWs we followed the definition of Hobday et al. (2016, 2018), where a MHW was defined as a period of at least five consecutive days with temperature above a 90. percentile climatology. If a cold period of two days or less separates two heatwaves, the full period was concatenated and counted as one event, including the cold gap. Following Hobday et al. (2016) we based our heatwave calculations on smoothed daily SSTs, using a centered 11-day window. The climatology 90. percentile was a 30-day smoothed percentile calculated on the analyzed 20-year period 1995–2014, resulting in a daily varying threshold to allow for MHW during all seasons. The MHW intensity was calculated as the difference between the smoothed SST and the daily climatology during active MHWs. Our calculation of seasonal length and frequency was categorized based on start date of the detected MHW. This means that if a MHW starts on 28. February it is categorized as a winter heatwave, though it expand into the spring. The MHW detection algorithm was downloaded from https://github.com/ecjoliver/marineHeatWaves and modified to handle spatial grids.

2.7 Temperature tendency

The mixed layer heat budget can be written as:

$$\begin{aligned} \frac{Dc_p\rho h T}{Dt} = Q_{0} - Q_{-h} = Q_{net} \end{aligned}$$
(1)

where \(\frac{D}{Dt}\) is the material derivative, \(\rho\) is seawater density, \(c_p\) is specific heat of seawater, h is the mixed layer depth, T is the mixed layer temperature, and \(Q_{0}\) is the net air-sea heat flux at the surface and \(Q_{-h}\) is the net radiative (shortwave penetration) and diffusive (molecular) heat flux at the base of the mixed layer. The difference in heat flux through the top and the bottom of the mixed layer is then the net heat flux, \(Q_{net}\). Previous studies has shown that the surface heat flux term (\(Q_0\)) dominates the mixed layer heat budget by changing the temperature (Montoya-Sánchez et al. 2018). Therefore we simplify by assuming \(c_p\) and h are constants and that \(Q_{-h}<<Q_{0}\), and after reorganizing we can write:

$$\begin{aligned} \frac{\partial T}{\partial t} \approx \frac{Q_{0}}{h c_p\rho }, \end{aligned}$$
(2)

recognizing that the rate of change in temperature is proportional to the surface heat flux and inversely proportional to the depth of the mixed layer.

3 Results

First we evaluate model performance with regards to daily SST. This insight is important because the detection and evaluation of MHWs are based on SST and its variance in the upper ocean layer. Global climate models generally represent the mean climatology well but their coarse resolution often results in lower variability, both in space and time (Costa and Rodrigues 2021). This is also the case here. SST mean climatology is comparable in the simulations (Fig. 1a–f) with an overall small warm difference compared to CRW. This difference is not surprising, and the models are perhaps more correct than CRW as satellite products have been shown to have a cool bias compared to in situ observations in the area (Gomez et al. 2020). Larger model-specific biases arise when examining temporal variability. Figure 1g–l shows the daily standard deviation of the surface temperature from CRW and the differences between this calculated standard deviation and the standard deviations in COAWST\(_{ERA5}\), the NorESM-OC, COAWST\(_{NorESM2-MM}\), the NorESM2-MH and the NorESM2-MM simulations, respectively. CRW SSTs are based on nighttime-only and have also been shown to underestimate the nighttime SST variability (Gomez et al. 2020) we therefore expect a higher standard deviation in the model simulations than in the CRW dataset. That is not the case for the coarser NorESM models, they show less variability – particularly in the Gulf of Mexico and along the US eastern coast. The NorESM-OC simulates the lowest temporal variability, whereas the two higher resolution COAWST models simulate larger temporal variability in the SSTs.

Fig. 1
figure 1

Mean sea surface temperatures in a the CRW and the sea surface temperature biases for b COAWST\(_{ERA5}\), c NorESM-OC, d COAWST\(_{NorESM2-MM}\), e NorESM2-MH and f the NorESM2-MM simulation. Standard deviation of daily sea surface temperatures in g the CRW and the standard deviation biases for h COAWST driven by ERA5 and SODA, i NorESM-OC, j COAWST driven by NorESM2-MM, k NorESM2-MH and l the NorESM2-MM simulation. The red box shows the region subject to further analysis in this study

Our next step is to investigate MHW metrics. To ensure a fair comparison despite different internal model climatologies, each set of metrics are calculated based on the SSTs within the dataset itself, this means that the heatwaves detected are relative to the climatology in the specific dataset, and consequently, the threshold values are not equal in all datasets, but instead the 90. percentile based on the SSTs in the respective dataset. Figure 2a–f shows the average intensity of the detected heatwaves in CRW and the biases of the intensities in the simulations. Both COAWST simulations slightly overestimate the intensity whereas the NorESM simulations have more diverse bias signals with areas of both over and underestimation, but generally represent the observed mean intensity well. The differences between the regional and global models are more pronounced when we focus on MHW frequency in Fig. 2g–l. COAWST simulates frequency well whereas NorESM simulations tend to underestimate the number of heatwaves in most of the study domain, particularly in the coarser models. Duration of heatwaves (Fig. 2m–r) is better represented in the COAWST simulations compared to the NorESM simulations which have issues with longer heatwaves. Again the largest model errors arise in the coarser global models.

Fig. 2
figure 2

Calculated degree heating weeks per year is shown as intensity of CRW in a, and the differences in intensity for b COAWST\(_{ERA5}\) c NorESM-OC, d COAWST\(_{NorESM2-MM}\), e NorESM2-MH and f NorESM2-MM simulations. Similar for the averaged frequency of heatwaves per year; CRW in g and the differences in h–l and for the averaged length of a CRW heatwave in m and the differences in n–r

To further disentangle the biases we now focus on seasonality, categorized according to MHW start date, and compare the frequency and duration of the MHWs. Figure 3a shows the average frequencies of the detected MHWs in a selected box over the Caribbean Sea (10°N–20°N,80°W–55°W, red box in Fig. 1) for each season. Particularly during the winter season the number of heatwaves detected is substantially low in the NorESM models compared to CRW, while both COAWST simulations are able to represent the frequencies well. For reference we have also included a CMIP6 ensemble mean consisting of 9 single model realizations from different modeling centres which shows similar winter bias as the NorESM simulations (see appendix for the list of specific models). The opposite is the case when we examine the lengths of the MHWs (Fig. 3b), the NorESM heatwaves are too long, especially during winter. This means that the number of MHWs detected in the NorESM simulations is markedly underestimated, but when a heatwave arise, it is considerably longer than in the observational datasets, similar for the CMIP6 ensemble mean. The temporal distribution of MHWs are much better represented in both COAWST model simulations. COAWST\(_{ERA5}\) represents the observations very well, and COAWST\(_{NorESM2-MM}\) reduces the bias from the driving NorESM2-MM model considerably.

Fig. 3
figure 3

Seasonal distributions of MHWs. In a the frequency and in b the length of MHWs in the observational dataset CRW, the COAWST\(_{ERA5}\) simulation, the NorESM-OC, the COAWST\(_{NorESM2-MM}\) (abbreviated COAWST\(_{NE2-MM}\)), the NorESM2-MH, the NorESM2-MM and an ensemble mean of nine CMIP6 models

We now investigate the temporal climatology of the upper ocean vertical density profile in the selected box over the Caribbean Sea. Figure 4 shows the 20 year climatology of density profiles in SODA and in the investigated simulations for comparison. The models are showing a monthly climatology of density data, due to the lack of data availability. COAWST models are interpolated to NorESM model depth levels using bilinear interpolation. The black line represents the monthly MLDs calculated based on the density criteria of 0.03 kg m\(^{-3}\). Figure 4 shows that the stratification is stronger in the COAWST simulations than in the NorESM simulations. This leads to a too deep representation of the MLD in the NorESM models—-yearly average is 43 ms, 41 ms and 45 ms in NorESM2-MH, NorESM2-MM and NorESM-OC respectively, while the COAWST simulations compare well with SODA with a yearly averaged MLD of 26 ms in COAWST\(_{ERA5}\), 27 ms in COAWST\(_{NorESM2-MM}\) compared to the MLD of 29 ms in SODA. The MLDs in the NorESM models are considerably deeper than MLD in SODA and COAWST, particularly during late autumn and winter when the bias in the frequency and length of MHW are most pronounced (see Fig. 3). Also, the interseasonal variations of the MLD are considerably larger in NorESM simulations than in COAWST simulations. One may speculate that these biases are influenced by the bathymetry in the models, however the average depths in the examined box are all considerably deeper than the mixed layer; 2895 m, 3304 m and 3176 m in the COAWST, NorESM2-MH and NorESM2MM/OC simulations respectively.

Fig. 4
figure 4

Vertical density profiles with the mixed layer depth climatology in black line for SODA, the COAWST\(_{ERA5}\) (abbreviated C\(_{ERA5}\)), the NorESM-OC (abbreviated NE-OC), the COAWST\(_{NorESM2-MM}\) (abbreviated C\(_{NE2-MM}\)), the NorESM2-MH (abbreviated NE2-MH), the NorESM2-MM (abbreviated NE2-MM)

In an effort to disentangle the mechanisms behind the distinct model differences we separate the detected heatwave days into increasing and decreasing days, i.e. days where the heatwave intensity is increasing and days where the intensity is decreasing - independent of the day of maximum intensity. This analysis could not be performed with the NorESM-OC as it does not have daily flux data. Therefore, we focus on the coupled models only for the remainder of this section. Figure 5 focuses on December–February, the season with the largest bias and model differences with regards to lengths and frequencies. It is clear that the temperature tendency has larger magnitudes in the COAWST simulations (Fig. 5a–b and Fig. 5f–g), than in the NorESM simulations (Fig. 5k–l and Fig. 5p–q), both during increasing and decreasing heatwave days. Mechanisms behind these differences are investigated by examining the temperature tendency equation (Eq. 2). The latent heating dominates the net air-sea flux in all simulations and influences the difference between models during increasing days. An increase in 2 m humidity and a decrease in 10 m wind speed relative to season climatology leads to less latent heat release (not shown). The reduced latent heat release is clear in all models (Fig. 5c,h,m,r), but much more so in the COAWST simulations. During decreasing heat wave days the latent heat release is of similar magnitude in all models (Fig. 5d,i,n,s). Another major contributor to model differences are the depth of the mixed layer which we show as the seasonal mean in (Fig. 5e,j,o,t). A deeper mixed layer in the NorESM models results in a higher heat storage capacity and consequently a slower warming and cooling ocean. The mixed layer depth is considerable shallower in the COAWST models and its mixed layer energy grows and dissipates faster than in the NorESM models.

Fig. 5
figure 5

COAWST\(_{ERA5}\) simulated SST tendency for increasing and decreasing days are shown in panels a and b, respectively. The red box shows our focus area in the Caribbean Sea. Panels c and d show the latent heating for the increasing and decreasing heatwave days. Panels e mean mixed layer depth. Likewise in Panels f–j for the COAWST\(_{NorESM2-MM}\), panels k–o for the NorESM2-MH and panels p–t for the NorESM2-MM simulations. All panels show December—February climatology

4 Discussion and concluding remarks

In this study we investigated MHWs and the added value of their representation in a higher resolution regional coupled model. We utilized the coupled atmosphere-ocean model COAWST and compared the outcome with three existing global model simulations with respect to the representation of SSTs and MHWs in the Caribbean Sea where multiple coral reef ecosystems are under stress (Boveid et al. 2022), and where only minor changes can have major effects as seen in e.g. Australia (Schoepf et al. 2015).

For SST, the global NorESM models generally perform well with regards to mean climatology, however the coarse resolution in global models precludes a realistic representation of regional and local scale phenomena. Here the higher resolution regional simulations leads to a higher spatial and temporal variability, particularly in areas where complex regional and local scale features dominate, e.g. the Caribbean Sea, the Gulf of Mexico and along the eastern US coast.

The MHW metrics, calculated on each individual dataset, reveal substantial biases in the global models when it comes to frequency and duration; the global models simulate too few heatwaves that last too long. In comparison, the coupled regional simulations demonstrated added value through their considerably better temporal distribution of heatwaves with regards to frequency and length representation. This is consistent with findings in Pilo et al. (2019) who suggests that higher spatial resolution improves the representation of MHW duration and frequencies. However, the relatively smaller improvement from NorESM2-MM to NorESM2-MH compared with the regional models demonstrates that the added value is not obtained by simply increasing the horizontal resolution. Instead, it is necessary to use coupled regional models with parameterization designed for resolving regional and local scale dynamical features or alternatively resolution that enables explicit representation.

Our ocean focus area, the Caribbean Sea, is an area with strong dynamic coupling between ocean and atmosphere. A correct representation of such regional and local scale dynamics is important to represent the temporal distribution of MHWs better. Our study shows that the slower increasing and decaying surface temperatures, leading to longer and fewer heatwaves in the global models, are driven by atmospheric and oceanic processes at the air-sea interface. When MHW intensity is increasing, dissimilarities arise from differences in surface heat fluxes, dominated by a reduction in latent heat release due to lower wind speeds, while the shallower, but more realistic ocean mixed layer in the COAWST simulations leads to a faster increasing and decreasing SST during heatwaves. Also previous studies have suggested surface fluxes, weaker advection (Bond et al. 2015; Sen Gupta et al. 2020; Vogt et al. 2022) and extremes in the ocean mixed layer temperature tendency budget (Holbrook et al. 2019) as reasons for MHWs onset and decline. However, lack of daily data inhibits a further disentangling of the heat budget in this study and we highly recommend future studies to ensure full 3D ocean data on a daily basis to allow for more detailed investigation.

While global models have widely been used to assess MHWs and its impact in both present day and in future we recommend that the added value of regional models, demonstrated in this study, is considered. Our findings emphasizes the need for higher resolutions when assessing future projections of MHWs, and the use of models that better represent air-sea interactions and local dynamics to improve understanding of MHWs and their response to global warming. Particularly for impact studies we emphasize the necessity to use the best possible representation of MHW metrics to ensure reliable insights and address consequences of the physical changes on the ocean ecosystem. Coral reef and ocean biodiversity management are two of many examples of services that are in urgent need of reliable MHW metrics to gain further knowledge of the impact of climate change and the potential options to mitigate and adapt. In the context of blue carbon mitigation action, improved dynamical understanding and projections of MHWs in the coastal is also critical to assess the integrity of coastal carbon stocks in the future (Canadell et al. 2021).