Article Highlights

  • GEWEX-EEI compares 21 ocean heat content time series from reanalysis, in situ and satellite observations

  • We find substantial spread in ocean heat uptake and variable skill in tracking radiometric EEI variability

  • Follow-on investigations and recommendations are proposed to reconcile estimates and their uncertainties

1 Introduction

Detection and understanding of climate change rely on research investigations by a broad international science community utilizing a vast array of climate models and Earth observations ranging from in situ measurements made in the deep ocean to satellite measurements made at the top of the atmosphere (TOA). Improvements to our understanding of the state of Earth’s climate and projected future changes are communicated to society through several governmental channels such as the climate assessments of the Intergovernmental Panel on Climate Change (e.g., IPCC 2021; Forster et al. 2021; Gulev et al. 2021) or the annual World Meteorological Organization (WMO) State of the Global Climate Reports, which in 2022 (WMO 2023) focused on key climate indicators, such as greenhouse gasses, temperature, sea level rise, ocean heating and acidification, sea ice and glacier melt. Identifying and examining indicators that robustly and comprehensively measure change to the climate and the impact of current and future anthropogenic activities is a crucial aspect of the first Global Stock Take under the Paris Agreement (Peeters 2021; Forster et al. 2023).

The most holistic picture of heat accumulation by the Earth system is gained by quantifying and assessing change in Earth’s radiative Energy Imbalance (EEI) at the TOA, representing the cumulative effect of radiative forcings and feedbacks. At present, Earth’s heat uptake or inventory, respectively, serves to constrain EEI absolute magnitude over decadal timescales (e.g., von Schuckmann et al. 2016, 2023; Trenberth et al. 2016; Johnson et al. 2016; Meyssignac et al. 2019; Cheng et al. 2022a, b). According to the latest GCOS assessment of Earth’s heat inventory, the absolute magnitude of EEI for the period 2006–2020 is 0.76 ± 0.2 W m−2, which combines ensemble estimates of ocean heat uptake (OHU), terrestrial as well as atmospheric heat storage, and the heat energy required to melt land and sea ice and evaporate water to increase atmospheric moisture content (von Schuckmann et al. 2023). About 90% of Earth’s heat surplus is stored in the ocean; hence, assessments of EEI magnitude and uncertainty largely depend on accurate OHU estimates. To date, standard approaches to estimating OHU (e.g., Hakuba et al. 2021; Cheng et al. 2022a, b) are: (a) to derive ocean heat content (OHC) changes from direct subsurface ocean temperature observations through hydrographic profiles (e.g., Levitus et al. 2012); (b) to derive the oceans' thermosteric expansion through sea level budget assessment using geodetic observations from space (e.g., Marti et al. 2022); and (c) to estimate the ocean state using global ocean models and reanalyses that assimilate various ocean and atmosphere observations (e.g., Balmaseda et al. 2015; Forget et al. 2015; Zuo et al. 2019).

Since circa 2005, when Argo reached critical spatiotemporal coverage, experts across the globe have been examining Argo profiling float array observations (Riser et al. 2016) along with other ocean in situ observations (e.g., reviewed in Meyssignac et al. 2019; Cheng et al. 2022a, b) to estimate global OHC time series. More recently, geodetic satellite observations have been used for estimating OHU and its variability (Marti et al. 2022; Hakuba et al. 2021). Despite the lack of vertical resolution and requiring independent knowledge of seawater’s heat expansion efficiency, the combination of near-global satellite-based estimates of ocean mass and total sea level change has proven successful in matching positive trends in EEI and its year-to-year variability with Clouds and Earth's Radiant Energy System (CERES) Energy Balanced and Filled (EBAF) net radiative flux at the TOA (Marti et al. 2022; Hakuba et al. 2021). Reanalyses that assimilate ocean observations into physically consistent ocean models also provide OHU together with complete ocean state estimates, the latter enabling the study of heat exchanges, distributions and their causes (Storto et al. 2019).

Beyond being a fundamental metric for tracking change in Earth’s climate, EEI also represents a target value in global climate model tuning (Cheng et al. 2016a, b; Hourdin et al. 2017; Smith et al. 2015; Schmidt et al. 2023) and serves in constraining Earth’s equilibrium climate sensitivity (e.g., Sherwood et al. 2020; Chenal et al. 2022) and climate feedback parameter (Meyssignac et al. 2023a). Improved understanding of EEI and a robust estimate of its uncertainty is vital in reconciling global energy and water cycles using consistent budgets and optimization processes (L’Ecuyer et al. 2015; Roberts et al. submitted to this issue), which is one of the key goals of the Global Energy and Water Exchanges (GEWEX) community (Stephens et al. 2023). EEI monitoring alone is not sufficient for understanding the implications that natural and anthropogenic composition changes impose on Earth’s energy budget. Characterizing and attributing EEI changes and their drivers on seasonal (Johnson et al. 2023a, b; Pan et al. 2023) as well as interannual and longer timescales (Cheng et al. 2019a, b; Loeb et al. 2021, 2024, this issue; Stephens et al. 2022) require extensive use of ancillary surface and atmospheric property information as well as radiative transfer and global climate models to help interpret the role of climate forcings and climate feedbacks (e.g., Raghuraman et al. 2021). Although EEI is a global metric of Earth system change, regional investigations of ocean heat distribution and transports (e.g., Meyssignac et al. submitted to this issue; Trenberth et al. 2019) as well as the processes driving heat uptake in the atmosphere and at the surface (Mayer et al. 2024, this issue) are key aspects of holistic EEI assessments. One way to reconcile data from all available observing networks and their scale of variability is through Earth system reanalysis capable of ingesting all observational information, i.e., TOA radiation, OHC, altimetry, gravimetry (Stammer et al. 2016; Storto et al. 2017; de Rosnay et al. 2022).

Due mainly to calibration and retrieval uncertainties that are one order of magnitude larger than EEI itself, EEI absolute magnitude cannot be derived from radiometric observations such as provided by the Clouds and Earth's Radiant Energy System (CERES; Wielicki et al. 1996) and the Solar Radiation and Climate Experiment (SORCE; Rottman 2005), unless adjustments are made to match the global net radiative flux to independent estimates of long-term planetary heat uptake (Loeb et al. 2018). These adjustments do not affect the time variability nor trend in radiometrically derived EEI significantly. Figure 1 contrasts the EEI time series from single-scanner Terra data with the energy balanced (EBAF) multi-platform (Terra, Aqua, NOAA20) climate data record. Although the Terra record suggests unrealistically high EEI absolute magnitude (time series mean), irreconcilable with heat uptake estimates and current knowledge of radiative forcings and feedbacks that shape EEI (Loeb et al. 2009a), both show near-identical time variability and trend over March 2000 to July 2023. The offset correction in EBAF is commensurate with an estimate of EEI absolute magnitude at 0.71 W m−2 over 2005–2015 according to Johnson et al. (2016).

Fig. 1

EEI (12-month running mean) derived from CERES observations. The CERES single-scanner net radiative flux from Terra (blue) provides unrealistically high EEI, irreconcilable with heat uptake estimates and current knowledge of radiative forcings and feedbacks that shape EEI (Loeb et al. 2009a). EBAF combines data from multiple platforms (Aqua, Terra, NOAA20) and is anchored to an independent estimate of EEI from heat uptake analysis (Loeb et al. 2018)

Although the absolute accuracy of satellite measurements is insufficient to close the TOA energy balance, the unparalleled measurement precision and stability of current CERES and future Libera measurements allow for the study of EEI time variations and trends. As of 2021, the trend in EEI over 2005–2019 is 0.50 ± 0.47 W m−2 dec−1 (5–95% confidence intervals) according to CERES EBAF analysis, indicating an approximate doubling of EEI in this 14-year period (Loeb et al. 2021; Cheng et al. 2024a). This tendency agrees with global OHU from combined in situ and altimetry measurements (Loeb et al. 2021), as well as satellite-based OHU time series (Marti et al. 2022; Hakuba et al. 2021).

Although comprehensive assessments of EEI and OHU exist (e.g., von Schuckmann et al. 2020; 2023; Meyssignac et al. 2019; Cheng et al. 2019b, 2022a), there is a lack of systematic intercomparison across different methods and products. Methodological differences and assumptions across in situ-based OHC estimates alone are various and range from different data sources, quality control procedures, mapping/interpolation techniques and prior statistics, to mathematical assumptions in the derivation of OHC and OHU from temperature and salinity profiles, and sampling considerations (Boyer et al. 2016; Cheng et al. 2016a, b, 2022b). Furthermore, measurements of OHC (geodetic or in situ) and net radiative flux (satellite) represent inherently different temporal and spatial scales due to different sampling frequencies and processing, e.g., spatial and temporal interpolation, which need to be better understood and quantified. Recognizing the need for a better understanding of the data products available, their discrepancies, uncertainties as well as sources thereof, the WCRP initiated the formation of the GEWEX-EEI assessment working group to intercompare different OHC and OHU estimates in conjunction with other sinks of heat (e.g., atmospheric heat storage) and EEI variability from radiation budget data. GEWEX-EEI aims to (1) facilitate OHC and OHU intercomparison on global and regional scales by the international community, (2) formulate recommendations and best practices to enable “apples-to-apples” intercomparison across products and their uncertainties, (3) improve EEI central and uncertainty estimates, and (4) improve understanding of EEI variability. 
In spring 2023, the community met for the first time at a joint WCRP-ESA EEI assessment workshop at ESA-ESRIN, Frascati, Italy, bringing together experts in radiometric remote sensing, satellite altimetry, space gravimetry, ocean in situ measurements and ocean reanalysis/ocean state modeling to assess and intercompare estimates of EEI, their time variability and uncertainties (Meyssignac et al. 2023b). Key findings and recommendations are presented and discussed in Sect. 4.

As part of this EEI assessment effort, the community was asked to share global annual mean OHC series as produced in-house at respective institutions to highlight the spread incurred by different approaches. Based on data available at the time of analysis, this paper intercompares the different OHC estimates, their variability and their trends expressed as OHU in TOA-equivalent W m−2 together with uncertainties (Sect. 3.1). Comparison of OHU interannual variability and CERES global net radiative flux (EEI) is presented in Sect. 3.1. Section 3.3 assesses zonal trends in net radiative flux using CERES EBAF data to elucidate regional variations and their co-variability with trends in clear-sky and cloud radiative effects as well as cloud properties.

The critical need to enable seamless climate continuity observations from space has been recognized by the international satellite community (KISS Continuity Study Team 2024) and both GEWEX-EEI and the Global Climate Observing System (GCOS) Working Group on energy cycle closure recommend addressing looming gaps. Section 3.2 demonstrates the impact of data gaps on OHU estimates and EEI trends, highlighting the need for global coverage and seamless continuity to ensure trend estimates are reliable and trend uncertainties remain as low as possible.

2 Methods and Data

2.1 Approach

To intercompare OHU estimates and trends over 2005–2020, we analyze an ensemble of 21 OHC datasets, comprised of ten OHC datasets based on in situ observations, two satellite-based OHC datasets combining satellite altimetry and space gravimetry measurements, three hybrid methods combining in situ measurements with satellite information and six ocean reanalysis products. OHC estimates from in situ data originate from EN4 (Good et al. 2013), the In Situ Analysis System (ISAS20; Gaillard et al. 2016), Scripps Institution of Oceanography (SIO, Roemmich and Gilson 2009), Cheng and Zhu analysis (IAP, Cheng and Zhu 2016; Cheng et al. 2017; Cheng et al. 2024b), NOAA NCEI (Levitus et al. 2012), PMEL (Lyman and Johnson 2008), JAMSTEC (Hosoda et al. 2008), LocalGP (Giglio et al. 2024), JMA (Ishii et al. 2017) and von Schuckmann and Le Traon (2011), hereafter vS&LT. Satellite-based OHC is provided by Jet Propulsion Laboratory (JPL, Hakuba et al. 2021) and Legos-Magellium (Marti et al. 2022). The hybrid estimates originate from CNR-ISMAR (Storto et al. 2022) and PMEL, namely from PMEL-combined (Lyman and Johnson 2014) and RFROM (Lyman and Johnson 2023) analyses. The six reanalysis datasets considered are ECCOv4 (Forget et al. 2015), OCCA2 (Forget 2024), SODA3 (Carton et al. 2018), CIGAR (Storto and Yang 2024), ORAS-5 (Zuo et al. 2019) and the Copernicus Global Reanalysis Ensemble Product (GLORYS2V4, Lellouche et al. 2018; ORAS-5; C-GLORSv7, Storto and Masina 2016). The datasets are summarized in Table 1 and described in more detail in the supplementary information (SI T1).

Table 1 List of all data products considered in analysis and discussion, including product name, reference and keywords describing the source data and methods

Many of these datasets were provided to the GEWEX-EEI assessment via https://sites.google.com/magellium.fr/eeiassessment/data-records/, while ECCOv4, SIO, JAMSTEC, SODA3, ORAS-5, Copernicus, OCCA2, CIGAR, vS&LT and RFROM data were either directly provided by the data producer or downloaded from the data producer website. Most of the OHC datasets were provided as global annual averages (JPL, Legos-Magellium, ECCO, ORAS-5, CNR-ISMAR, EN4, ISAS20, NCEI, PMEL, PMELc, LocalGP, Ishii, OCCA2, IAP, vS&LT, Copernicus, CIGAR) and/or as monthly gridded data (SIO, ORAS-5, SODA3, JAMSTEC, RFROM, ECCO, Legos, EN4, IAP, Ishii, NCEI). From monthly OHC, we compute annual averages based on bin-averaging of the monthly values. Gridded datasets are globally integrated to obtain global values in J. All OHC products are converted to ZJ and interpolated in time such that each annual mean is centered on the middle of the year. SODA3, JAMSTEC and SIO do not provide OHC datasets but gridded temperature and salinity profiles. For those, we derive the OHC from the temperature and salinity by vertical integration of the specific heat of seawater multiplied by the local density of seawater and the oceanic temperature using the TEOS-10 GSW software (McDougall and Barker 2011) as, e.g., in Melet and Meyssignac (2015).
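The two processing steps described above, vertical integration of temperature profiles and bin-averaging of monthly values, can be sketched as follows. This is a minimal illustration only: it assumes a constant seawater density and specific heat (the values of `RHO` and `CP` are illustrative placeholders), whereas the analysis itself uses the TEOS-10 GSW software with local density and specific heat. The function names are hypothetical.

```python
# Illustrative constants (assumptions, NOT the TEOS-10 treatment used in the paper)
RHO = 1025.0  # seawater density, kg m-3
CP = 3990.0   # specific heat of seawater, J kg-1 K-1


def column_heat_content(temperature, thickness, rho=RHO, cp=CP):
    """Integrate rho * cp * T over depth for one profile; returns J m-2.

    temperature: layer temperatures (or temperature anomalies), K or degC
    thickness:   layer thicknesses, m (same length as temperature)
    """
    if len(temperature) != len(thickness):
        raise ValueError("temperature and thickness must have the same length")
    return sum(rho * cp * t * dz for t, dz in zip(temperature, thickness))


def annual_means(monthly):
    """Bin-average consecutive blocks of 12 monthly values.

    A trailing partial year is dropped.
    """
    n_years = len(monthly) // 12
    return [sum(monthly[12 * i:12 * (i + 1)]) / 12.0 for i in range(n_years)]
```

With these constants, a uniform 1 K anomaly over a 2000 m column corresponds to roughly 8.2 × 10⁹ J m−2.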

Some products only cover the ocean down to 2000 m depth (LocalGP, PMEL, PMELc, ISAS, NCEI, EN4, Ishii, SIO, JAMSTEC, RFROM, vS&LT). For consistency across OHU estimates, we add to these products a constant deep-ocean (below 2000 m) heating rate of 0.06 ± 0.04 W m−2 (Purkey and Johnson 2010; Johnson et al. 2023b). OHU time series are derived from the time derivative of OHC using centered differences, which lightly filter the data and slightly reduce the noise compared to first differences, and are normalized by Earth’s surface area at the TOA: 5.14 × 10¹⁴ m² at 20 km above Earth's surface.
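As a minimal sketch of this conversion, the snippet below takes annual OHC values in ZJ, applies centered differences and normalizes by the TOA surface area quoted above. The function name and the choice of a 365.25-day year are assumptions made for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 86400.0  # assumption: Julian year
TOA_AREA_M2 = 5.14e14                # Earth's surface area at the TOA (from the text)
J_PER_ZJ = 1.0e21


def ohu_centered(ohc_zj, dt_years=1.0):
    """Centered-difference dOHC/dt, converted from ZJ per year to W m-2.

    Returns one value per interior time step (the first and last points
    have no centered difference).
    """
    scale = J_PER_ZJ / (SECONDS_PER_YEAR * TOA_AREA_M2)
    return [
        (ohc_zj[i + 1] - ohc_zj[i - 1]) / (2.0 * dt_years) * scale
        for i in range(1, len(ohc_zj) - 1)
    ]
```

Under these assumptions, 1 ZJ per year corresponds to about 0.062 W m−2, so a steady OHC increase of 10 ZJ per year maps to roughly 0.62 W m−2.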

The trend and acceleration of OHC are calculated using an ordinary least squares (OLS) estimator. Uncertainties in the trend and acceleration are given by the variance of the estimator which is derived from each dataset’s OHC uncertainties. For some of the OHC datasets, at the time of manuscript submission, uncertainty estimates were not available, and the OHU and OHU trend uncertainties are derived from the linear fit. Some other OHC datasets (CNR-ISMAR, EN4, NCEI, PMEL, ECCO, Copernicus, OCCA2, CIGAR, IAP, ORAS-5, vS&LT, ISAS, LocalGP and Ishii) provide annual estimates of uncertainty or ensemble spread without providing the time correlation across annual uncertainties; hence, their trend and acceleration uncertainties ignore any potential temporal correlation effects. A few OHC datasets (JPL and Legos-Magellium) provide a variance–covariance matrix that describes the annual uncertainties and their time correlation.
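The following sketch illustrates how an OLS trend and its uncertainty can be obtained either from the fit residuals or from a supplied error covariance matrix. It covers the linear trend only; an acceleration estimate would follow the same pattern with an added quadratic term. The function name and interface are hypothetical and not taken from the analysis code.

```python
def ols_trend(t, y, cov=None):
    """OLS slope of y against t and its standard error.

    cov: optional error covariance matrix of y (list of lists). A diagonal
         matrix corresponds to uncorrelated annual uncertainties; a full
         matrix also carries their time correlation. If None, the residual
         variance of the linear fit is used instead.
    """
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    # The OLS slope is a linear combination of the data: slope = sum(w_i * y_i)
    w = [(ti - t_mean) / sxx for ti in t]
    slope = sum(wi * yi for wi, yi in zip(w, y))
    if cov is None:
        intercept = y_mean - slope * t_mean
        rss = sum((yi - intercept - slope * ti) ** 2 for ti, yi in zip(t, y))
        var = rss / (n - 2) / sxx
    else:
        # Variance of the estimator: w^T * cov * w
        var = sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))
    return slope, var ** 0.5
```

Passing a diagonal `cov` ignores temporal correlation, as is the case for products that provide only annual uncertainties, while a full variance–covariance matrix propagates the time correlation into the trend uncertainty.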

As part of our analysis, we investigate the impact of ocean sampling discrepancies on the OHU estimates (or OHC trend in W m−2), OHU trends (or OHC acceleration in W m−2 dec−1) and OHU correlation (R) with CERES EBAF net radiative flux for 12 gridded OHC datasets made available to us. We apply a restrictive ocean sampling (ROS) mask that covers ocean areas common to all products (SI, Fig. S1), largely masking shelf areas, coastal areas, shallow seas (shallower than 300 m), marginal seas and polar oceans beyond ± 60° latitude, which are generally not sufficiently sampled by Argo profiling floats. In addition, we test the impact of a mask limited to areas covered by satellite altimetry (i.e., limited to ± 66° latitude) to obtain first-order estimates of sampling uncertainty.
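A latitude-limited integration of a gridded field, as used for the altimetry-coverage test, can be sketched as below. The sketch assumes a regular latitude-longitude grid on a spherical Earth and marks missing or land cells with `None`; the function names, the Earth radius value and the grid layout are illustrative assumptions, and the ROS mask itself (which also excludes shallow and marginal seas) is not reproduced here.

```python
import math

EARTH_RADIUS_M = 6.371e6  # assumption: mean Earth radius, spherical Earth


def cell_area(lat_deg, dlat_deg, dlon_deg):
    """Area (m2) of a regular lat-lon cell centered at lat_deg."""
    lat1 = math.radians(lat_deg - dlat_deg / 2.0)
    lat2 = math.radians(lat_deg + dlat_deg / 2.0)
    return EARTH_RADIUS_M ** 2 * math.radians(dlon_deg) * (math.sin(lat2) - math.sin(lat1))


def integrate_masked(field, lats, dlat_deg, dlon_deg, max_abs_lat=66.0):
    """Area-integrate a gridded field (e.g., OHC in J m-2) to a total (J).

    field: list of rows, one per latitude in lats; None marks land/missing cells
    Cells whose center lies poleward of max_abs_lat are skipped.
    """
    total = 0.0
    for row, lat in zip(field, lats):
        if abs(lat) > max_abs_lat:
            continue
        area = cell_area(lat, dlat_deg, dlon_deg)
        total += sum(v * area for v in row if v is not None)
    return total
```

Comparing the integral for `max_abs_lat=66.0` with that of the full grid gives a first-order impression of how much of the signal resides outside the altimetry coverage.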

2.2 In Situ Observations

Historically, ship-based observations have been the main source of subsurface ocean temperature information. The Argo Program, designed in 1998 (Argo Science Team 1999; Roemmich et al. 2009), was transformational for subsurface ocean observing, enabling high-quality data (Wong et al. 2020) to be obtained nearly anywhere in the ocean, thus reducing geographical and temporal (seasonal) biases of ship-based observation systems. Argo first achieved significant coverage in both hemispheres in 2005 and reached its initial goal of 3000 profiling floats in November 2007. Its present coverage of about 3880 floats is close to the target of 4000 and is becoming more prevalent in marginal seas, seasonally ice-covered regions and the deep ocean below 2000 m (Jayne et al. 2017). Argo’s near-global uniform coverage has resulted in a dramatic reduction of the uncertainty of global OHC changes. Other subsurface observing systems contribute significant subsurface temperature data from 2005 to present, and in some cases, they are the main source of data (mainly in areas shallower than 2000 m depth, marginal seas and ice-covered areas). Observations from research ships (mainly conductivity–temperature–depth—CTD—casts) and ships of opportunity (mainly expendable bathythermographs—XBT drops from merchant ships), moored buoys (especially the tropical moored buoy arrays in the Pacific, Atlantic and Indian Oceans), ice-tethered profilers (in the high Arctic), gliders (mainly on and near continental shelves) and even instrumented pinnipeds all augment and extend the observations provided by the Argo array (Abraham et al. 2013; Meyssignac et al. 2019; Cheng et al. 2022a).

To estimate global integrals of OHC, algorithms are developed to grid temperature and/or OHC data, cope with data-sparse regions, and smooth the temporal and spatial fields. These algorithms are generally referred to as “mapping methods” and represent a leading source of uncertainty (Gregory et al. 2004; Boyer et al. 2016) in global OHC estimation, especially in data-sparse regions and eddy-rich regions with large spatiotemporal variability (Wang et al. 2018).

The ten in situ-based datasets used here (NCEI, LocalGP, IAP, Ishii, JAMSTEC, SIO, EN4, PMEL, ISAS and vS&LT) are listed in Table 1 and described in more detail in SI T1.

2.3 Geodetic Observations

The derivation of geodetic OHC is rooted in analysis of Earth’s sea level budget (Marti et al. 2022; Hakuba et al. 2021). To obtain global steric sea level change, global mean sea level (altimetry) and ocean mass change (gravimetry) observations are differenced, considering geophysical corrections such as related to glacial isostatic rebound effects (Caron et al. 2018). The steric change is translated into OHC and OHU using estimates of the ocean's expansion efficiency of heat. Full details on the geodetic OHC products used here (JPL and Legos, see Table 1) are provided in SI T1.
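The sea level budget approach can be sketched in a few lines. The example assumes that global mean sea level and ocean mass (barystatic) anomalies are already expressed in mm with all geophysical corrections applied, and that the expansion efficiency of heat is supplied in mm per ZJ. The function name is hypothetical, and the efficiency value in the usage note is a placeholder chosen only for illustration, not a value taken from the products discussed here.

```python
def geodetic_ohc(gmsl_mm, mass_mm, efficiency_mm_per_zj):
    """Convert sea level budget terms into OHC anomalies (ZJ).

    gmsl_mm: global mean sea level anomalies from altimetry, mm
    mass_mm: ocean mass (barystatic) sea level anomalies from gravimetry, mm
    efficiency_mm_per_zj: expansion efficiency of heat, mm of steric rise per ZJ
    """
    if len(gmsl_mm) != len(mass_mm):
        raise ValueError("input series must have the same length")
    # Steric sea level is the residual of total sea level minus the mass term
    steric_mm = [g - m for g, m in zip(gmsl_mm, mass_mm)]
    return [s / efficiency_mm_per_zj for s in steric_mm]
```

For example, `geodetic_ohc([0.0, 3.5], [0.0, 2.0], 0.15)` attributes the 1.5 mm steric residual to about 10 ZJ of heat. Because the result scales inversely with the efficiency, a smaller assumed efficiency yields a larger OHC amplitude and trend.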

2.4 Ocean State Estimates and Reanalysis

Ocean reanalyses combine multiple data sources through data assimilation in a numerical model (Storto et al. 2019). The types of datasets being assimilated as well as the models and assimilation methods vary between reanalyses. In our comparison, we focus on the six datasets described in SI T1 (ECCOv4, OCCA2, SODA3, CIGAR, ORAS-5, Copernicus; Table 1), which continue to be improved and extended.

2.5 Earth Radiation Budget Data

CERES (Wielicki et al. 1996) currently flies multiple instruments on Terra, Aqua, S-NPP and NOAA20 satellite platforms, collecting and processing broadband shortwave (SW) and longwave (LW) radiances since March 2000. Here, we make use of the CERES Energy Balanced and Filled (EBAF) Ed4.2 product (Loeb et al. 2018), providing global incoming solar, outgoing SW and LW, and net radiative fluxes, as well as cloud properties (derived from MODIS and VIIRS radiances) at monthly and 1-degree spatial resolution. Detailed descriptions of the data products and a wide range of publications applying the data in climate analyses can be found here: https://ceres.larc.nasa.gov/. The solar irradiances are derived from time-varying instantaneous total solar irradiance measurements from various sources (Loeb et al. 2018). In our comparisons with OHU variability, we consider the period 01/2005–12/2020. In EBAF, the long-term mean EEI (global long-term mean net radiative flux) is adjusted to match planetary heat uptake derived largely from in situ observations by applying an offset such that its mean value over the period 2005–2015 is consistent with the mean in situ estimate of 0.71 W m−2 after Johnson et al. (2016). This offset correction anchors the satellite data to the in situ EEI estimate and does not affect the trend or interannual variability of the EBAF time series (see also Fig. 1); thus, temporal variations in radiometric EEI remain independent of those from the OHU data (Loeb et al. 2021).
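The effect of such an anchoring can be illustrated with a constant offset applied to a time series. This is only a schematic sketch of the idea that a constant shift changes the mean but not the variability or trend; it is not the EBAF adjustment procedure itself, and the function name and the example values are hypothetical.

```python
def anchor_to_reference(series, years, target_mean, ref_start, ref_end):
    """Shift a series by a constant so its mean over a reference period
    equals target_mean. Returns the shifted series and the offset applied.
    """
    ref = [v for v, y in zip(series, years) if ref_start <= y <= ref_end]
    if not ref:
        raise ValueError("no data in the reference period")
    offset = target_mean - sum(ref) / len(ref)
    return [v + offset for v in series], offset
```

For instance, a series of `[4.0, 4.5, 5.0]` W m−2 for the years 2005, 2010 and 2015, anchored to a 2005–2015 mean of 0.71 W m−2, is shifted by −3.79 W m−2, while the differences between successive values are unchanged.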

3 Results

3.1 Global Ocean Heat Content and Heat Uptake Intercomparison

Figure 2 intercompares annual mean OHC time series, highlighting the two satellite-based products (a), the six series from reanalysis (b), the hybrid products that merge in situ with altimetry and other satellite information (c), and ten OHC time series based on in situ observations, of which four primarily ingest Argo data (SIO, LocalGP, ISAS and JAMSTEC) and the remaining six use all available ocean temperature data (d). For reference, we also include planetary heat content derived from integrating CERES EBAF net radiative flux (black lines), which itself has been anchored to PMEL-combined OHU and non-oceanic heat uptake estimates and is therefore not independent. Figure 4a combines all OHC time series and provides a summary of the individual OHC long-term trends or OHU, respectively, including trend uncertainties expressed in W m−2 (Fig. 4c). The OHU estimates are normalized with respect to Earth’s entire surface area at the TOA, approximately 20 km above Earth’s surface: 5.14 × 10¹⁴ m². The OHU derived from OLS trend analysis is comparable but not identical to OHU derived from differencing the last and first years of smoothed OHC time series (e.g., Cheng et al. 2022a, b; von Schuckmann et al. 2023) or deriving the mean OHU from the time-differenced dOHC/dt time series as in Fig. 3e–h (e.g., as done by Loeb et al. 2022). The trend estimate represents the annual rate of OHC change due to both natural and anthropogenic influences over the period of consideration, which means both forced secular and internal variability (e.g., ENSO) contribute to the change observed and ideally require separation. The observed rate of OHC change or OHU, respectively, has been acknowledged as a clear indication of continued warming of the ocean commensurate with observed increase in greenhouse gas emissions, partially compensated by direct aerosol effects, and their radiative forcing (Tokarska et al. 2019; Charles et al. 2020).

Fig. 2

Annual mean OHC (ZJ) provided by 21 institutions and grouped by overall method and observing system: a satellite-based geodetic OHC estimates, b OHC time series from ocean reanalyses, c hybrid approaches, marrying in situ and altimetry observations (PMELc), machine learning (RFROM) and multi-platform analysis (CNR-ISMAR), d OHC derived from in situ observations, considering Argo-only (ISAS, SIO, LocalGP and JAMSTEC) profile data and all available in situ observations (EN4, IAP, Ishii, NCEI, PMEL, vS&LT)

Fig. 3

Same as Fig. 2, but for annual mean ocean heat uptake (OHU) in W m−2

The full-column central OHU estimates over 2005–2020 derived from the 21 OHC products vary between 0.40 and 0.96 W m−2, indicating a significant spread with the lowest heating rate obtained from the ECCO ocean state estimate and the largest value derived from CIGAR. The two geodetic OHU estimates agree within uncertainties (0.89 W m−2), exceeding most estimates except for CIGAR. While the reanalysis systems SODA3 and ORAS-5 are in close agreement (0.65–0.67 W m−2), OHU estimates from ECCO (0.40 ± 0.12 W m−2) and CIGAR (0.96 ± 0.12 W m−2) are exceptionally low and high, respectively, representing the lower and upper limit of all estimates provided here. Although not as pronounced as for the reanalysis estimates, the spread across in situ estimates is substantial, from 0.51 (ISAS20 and EN4) to 0.74 (LocalGP) W m−2 (amounting to 45% of the lower value) and is significant considering the non-overlapping trend uncertainties at the 90% confidence level in Fig. 3c. In part, the discrepancies across in situ values are associated with different mapping/interpolation techniques, as well as decisions made in the quality control and bias correction of in situ profiles considered (Boyer et al. 2016; Cheng et al. 2016b, 2022b; Tan et al. 2023). Sampling considerations pertaining to the lack of ocean profiles in notoriously under-sampled areas of the ocean, e.g., shallow seas (shallower than 300 m), polar, coastal and shelf areas, yield different coverage areas and ocean volumes considered in the OHU calculations across products, especially between products that use Argo data alone versus products that include profiles from gliders, ice-tethered profilers, XBTs and other in situ observations (e.g., von Schuckmann et al. 2014; Meyssignac et al. 2019; Abraham et al. 2013). Hakuba et al. (2021) found that Argo-only datasets produced larger OHU rates than Argo+other in situ products, which is not strictly the case in the present analysis; however, the primarily Argo-ingesting LocalGP and JAMSTEC products indeed reside at the upper range of in situ-based OHU estimates. Discrepancies in ocean area/volume sampled account for some of the spread in OHU across in situ datasets (see Sect. 3.2). Similarly, geodetic observations are constrained to ocean areas equatorward of ± 66° latitude given the availability of altimetry data (Marti et al. 2022). Thus, except for ocean reanalyses, and the IAP, Ishii, EN4 and NCEI data products, none of the observed OHC and OHU changes is truly representative of the full global ocean. In Sect. 3.2, we investigate the impact of applying a restrictive ocean sampling mask (Fig. S1) to a subset of twelve gridded OHC datasets, and of a mask that limits the calculation of near-global OHC to the satellite sampling, for insight on potential sampling uncertainties and their implications for the intercomparison.

From the time derivative (centered differences) dOHC/dt, we derive annual mean OHU time series (Figs. 3, 5, 6). OHU time series are expected to track year-to-year variability in global mean TOA net radiative flux on annual timescales, when the Earth system is near energetic equilibrium and year-to-year heat uptake variations in other Earth system components are assumed to be small (e.g., Loeb et al. 2012, 2021). We assess the agreement in detrended year-to-year variability by providing each product’s correlation coefficient with detrended CERES EBAF net radiative flux in the Taylor diagram (Figs. 4b, 5 sub-panel legends) together with each time series’ average amplitude expressed as the standard deviation of the time series. Correlation coefficients of 0.44 or higher are obtained for the PMEL-combined (0.47), RFROM (0.44), ORAS-5 reanalysis (0.47), ECCO ocean state estimate (0.55), JPL (0.62) and Legos geodetic OHU (0.46) time series. Correlation coefficients smaller than +0.15 are found for the in situ datasets JAMSTEC, IAP, SIO, LocalGP, NCEI, EN4, vS&LT and the CIGAR and SODA3 reanalyses. The largest standard deviations, exceeding the CERES variability by more than 0.1 W m−2, are found for the JPL geodetic OHU time series, PMEL in situ, NCEI, OCCA2, CIGAR and SIO datasets. Eight of the time series produce standard deviations smaller than in CERES EBAF (0.34 W m−2) at this annual timescale. The amplitude and trend in JPL geodetic OHU are sensitive to the estimate of expansion efficiency needed to translate derived steric sea level change to OHC and OHU; hence, the large amplitude in OHU variability in the JPL product is at least partially the result of a smaller expansion efficiency considered (derivation in Hakuba et al. 2021) compared to that in the Legos product (Marti et al. 2022). 
Further study is needed to find consensus on the magnitude and variability of this critical conversion factor in order to improve both global and regional OHC and OHU estimates from geodetic observations.
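The two Taylor-diagram metrics used here, the detrended correlation coefficient and the standard deviation of the detrended series, can be sketched as follows. The sketch assumes evenly spaced annual values of equal length and uses the population standard deviation; both choices, as well as the function names, are assumptions made for illustration.

```python
import statistics


def detrend(y):
    """Remove the OLS linear trend and the mean from an evenly spaced series."""
    n = len(y)
    t_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    sxx = sum((i - t_mean) ** 2 for i in range(n))
    slope = sum((i - t_mean) * (yi - y_mean) for i, yi in enumerate(y)) / sxx
    return [yi - y_mean - slope * (i - t_mean) for i, yi in enumerate(y)]


def detrended_correlation(a, b):
    """Pearson correlation between two series after linear detrending."""
    da, db = detrend(a), detrend(b)
    num = sum(x * z for x, z in zip(da, db))
    den = (sum(x * x for x in da) * sum(z * z for z in db)) ** 0.5
    return num / den


def detrended_std(y):
    """Population standard deviation of the detrended series."""
    return statistics.pstdev(detrend(y))
```

Detrending both series first ensures that a shared long-term increase does not inflate the correlation, so that R reflects year-to-year co-variability only.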

In terms of co-variability, it is evident that products exploiting temporal and spatial patterns of OHC anomalies and their local correlation with sea surface height (PMEL-combined), and with sea surface height and SST (RFROM), as well as the two satellite-based estimates and the ECCO and ORAS-5 reanalyses exhibit the largest correlations (≥ 0.44) with CERES EBAF. All of these products either observe or fill in data-sparse regions based on additional physical knowledge to avoid relaxation of OHC maps back to a climatological mean (Durack et al. 2018; Cheng et al. 2019a, b; Lyman and Johnson 2023).

Figure 6 shows the co-variability of 13 OHU annual mean time series at 6-month increments, centered mid-year (January–December) and end of year (July–June), akin to the analysis in Loeb et al. (2021). All correlation coefficients increase slightly, by up to +0.19 (JAMSTEC), except for the IAP and NCEI data. In line with Loeb et al. (2021), we find good agreement in both correlation and OHU trend between EBAF and the PMEL-combined OHU series and provide an update in the respective panel of Fig. 6. Again, the largest correlations, exceeding 0.4, are common for reanalysis, satellite and the SSH/SST-informed mapping methods, and are now even more pronounced (up to 0.78 for the JPL product).
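Annual means at 6-month increments can be obtained from a monthly series by averaging 12-month windows that advance in steps of six months. The sketch below assumes a gap-free monthly series starting in January; the function name is hypothetical.

```python
def annual_means_6mo(monthly):
    """12-month means advancing in 6-month steps.

    For a series starting in January, the windows alternate between
    January-December and July-June.
    """
    return [
        sum(monthly[i:i + 12]) / 12.0
        for i in range(0, len(monthly) - 11, 6)
    ]
```

Two years of monthly data thus yield three overlapping annual means instead of two independent ones.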

In terms of OHU trends, or OHC accelerations (W m−2 dec−1, Figs. 4d, 5, 6), respectively, all products yield an increase over the observational period, although some estimates are not significant at 90% confidence (Legos, ORAS-5, CNR-ISMAR, CIGAR; Fig. 3c). Trends near or below 0.25 W m−2 dec−1, and therefore less than 50% of the CERES trend, are obtained for the ORAS-5, CIGAR, Copernicus, vS&LT and JAMSTEC products. Central OHU trend estimates exceeding the CERES trend in net radiative flux originate from SIO, JPL, ECCO and OCCA2 products. Considering the OHU trend uncertainty derived by Loeb et al. (2021) at 0.50 ± 0.47 W m−2 dec−1 and by data product, these discrepancies are not significant at the 90% confidence level, and the trends are indicative of accelerated ocean warming or an increase in EEI, respectively.

Fig. 4

a Ocean heat content (OHC) series (ZJ) derived from 21 data products. b Taylor diagram illustrating each detrended OHU time series’ standard deviation and correlation coefficient (R) with CERES net radiative flux over 2005–2019. Squares indicate a negative correlation. c OHC trend or OHU, respectively, including trend uncertainty derived from covariance matrix (α), annual OHC mapping standard errors or ensemble spread (β), and trend line residuals (γ) in terms of 90% and 68% confidence intervals using generalized least squares regression analysis. OHC trends for the JPL and Legos products are calculated three ways left to right using approaches α, β and γ, respectively. d Same as (c) but for OHC acceleration or OHU trend, respectively

Fig. 5

Annual OHU anomaly time series (dOHC/dt from centered differences, long-term mean removed) derived from 21 OHC products and compared to CERES net radiative flux (black line). Green lines indicate the trend line through the OHU series. Trend magnitude and detrended correlation coefficient are provided in the legend of each sub-figure. Gray shading in the JPL sub-panel indicates the GRACE-FO data gap during which the annual means are based on less than 12 months of data

Fig. 6

Same as Fig. 5 but for 13 products comparing annual means at 6-month increments centered mid-year (January–December) and end of year (July–June) akin to analysis in Loeb et al. (2021), their Fig. 1, and reproduced in the panel titled “PMELc.” Gray shading in the JPL sub-panel indicates the GRACE-FO data gap during which the annual means are based on less than 12 months of data

Applying a low-pass filter (Lanczos) with a cutoff period of 3 years as in Marti et al. (2022) removes high-frequency content related to intrinsic ocean variability (Palmer and McNeall 2014) and the mesoscale activity that is visible in altimetry but not in gravimetry, and improves the co-variability of the satellite-based Legos product with CERES net radiative flux (Fig. 7), from 0.62 (annual at 6-month increments) to 0.66. Different smoothing filters and their advantages are under investigation by multiple groups (e.g., Trenberth et al. 2016; Lyman and Johnson 2023; Marti et al. 2022), but their improvement of year-to-year variability, compared to CERES data, is not expected to exceed or fully meet the positive impact of more complete spatiotemporal sampling (e.g., geodetic) or regional filling (e.g., PMEL-combined, RFROM). However, both the “running average” at 6-month increments (Fig. 6) and the low-pass filter provide enhanced R coefficients, ascertaining that variations at timescales shorter than 1–3 years are non-representative of EEI variability, impede the direct comparison with CERES data and ought to be considered, minimized and better understood.
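A Lanczos low-pass filter of this kind can be sketched as below. This is a generic textbook construction (an ideal low-pass response tapered by the Lanczos sigma factor) and not necessarily the exact implementation of Marti et al. (2022); the function names and the window half-width are assumptions chosen for illustration.

```python
import numpy as np

def lanczos_lowpass_weights(cutoff_period, half_width):
    """Normalized Lanczos low-pass weights; `cutoff_period` is given in samples."""
    k = np.arange(-half_width, half_width + 1)
    fc = 1.0 / cutoff_period                      # cutoff frequency in cycles per sample
    # Ideal low-pass weights (sinc) multiplied by the Lanczos sigma factor.
    w = 2.0 * fc * np.sinc(2.0 * fc * k) * np.sinc(k / half_width)
    return w / w.sum()

def lowpass(series, cutoff_period, half_width):
    """Filter a series; only fully covered points are returned (edges are dropped)."""
    w = lanczos_lowpass_weights(cutoff_period, half_width)
    return np.convolve(np.asarray(series, dtype=float), w, mode="valid")

# A 3-year cutoff on a series sampled every 6 months corresponds to 6 samples.
weights = lanczos_lowpass_weights(cutoff_period=6, half_width=6)

# Synthetic check: a slow oscillation passes, a sample-to-sample oscillation is suppressed.
n = np.arange(48)
slow = np.cos(2 * np.pi * n / 24)                 # 12-year period at 6-month sampling
fast = np.array([1.0, -1.0] * 24)                 # 1-year period at 6-month sampling
print("slow amplitude after filtering:", np.max(np.abs(lowpass(slow, 6, 6))))
print("fast amplitude after filtering:", np.max(np.abs(lowpass(fast, 6, 6))))
```

The filter shortens the record by `half_width` samples at each end, which matters for records as short as the ones compared here.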

Fig. 7

MOHeaCan (Legos) time series of OHU (blue) and 90% of CERES EBAF net radiative flux (black). Both time series are low-pass-filtered at three-year cutoff time (Lanczos) to remove high-frequency noise related to intrinsic ocean variability (Palmer and McNeall 2014)

Across the 21 different OHC products, we find (1) that CERES net flux and OHU year-to-year variability agree remarkably well for the satellite-based products, two reanalyses and the in situ + satellite hybrid products, suggesting that a key to agreement is as complete a spatiotemporal ocean coverage as possible; (2) that datasets that match CERES variability (R > 0.44) also agree on a positive trend of similar magnitude, reinforcing the notion of accelerated ocean warming and increase in EEI; (3) that spatial sampling considerations impact the validity of our intercomparison as well as the interpretation of global OHU estimates, OHU trends and correlations with CERES data (see Sect. 3.2); and (4) that smoothing short-term variability in OHU and running averages at the sub-annual scale improve the co-variability with CERES data.

3.2 How Gaps Impact Trend Estimates and Their Accuracy

3.2.1 Ocean Sampling Considerations

To investigate the impact of inconsistent ocean sampling across data products on OHU estimates, OHU trends and their correlations with CERES data, we subsample twelve gridded OHC datasets with the very same restrictive ocean sampling (ROS) mask (SI Fig. 1) prior to computing “global” OHU (orange dots in Fig. 8). Likewise, we apply a mask that limits the near-global OHC calculation to the satellite coverage (± 66° latitude, green dots) and compare the results to the unrestricted OHU estimates based on the original gridded data products and their native ocean coverage (blue dots, Fig. 8c). Applying the satellite mask, the OHC trends or OHU, respectively, are reduced in reanalysis products, by about 5% (0.04 W m−2; relative to SODA3 OHU), while the impact on in situ data, which in many cases do not extend beyond ± 60° latitude (except NCEI), is marginal as expected. For reanalyses that depend strongly on assimilated observations, uncertainties are largest in under-sampled areas, impeding the interpretation of this result. In addition, a 5% sampling uncertainty is well within the OHU uncertainty of most data products (Fig. 4). Subsampling with the ROS mask affects all estimates, reducing satellite-based OHU by 17% (0.15 W m−2; relative to Legos OHU) and OHU from reanalysis by similar amounts. In situ-based OHU is at most reduced by 16% for EN4 and by no more than 6% for the Argo-only ISAS20 product. For the in situ OHU estimates, after applying the same ROS mask, the spread across the seven products (EN4, IAP, ISAS, Ishii, JAMSTEC, SIO, NCEI) reduces by 20% from 0.51–0.71 to 0.44–0.60 W m−2 (a range of 0.20 versus 0.16 W m−2), indicating that a common mask is required for the purpose of intercomparison and to reduce systematic differences in OHU across in situ products due to sampling considerations. 
The remaining discrepancies across Argo-only and across all in situ OHU estimates are dominated by differing source data, mapping techniques, varying baseline climatologies, quality control and bias correction procedures (Boyer et al. 2016; von Schuckmann et al. 2014; Meyssignac et al. 2019). At the same time, this experiment highlights the need for as complete a global ocean coverage as possible to represent a true global estimate of OHU, such that no heat is “missed” by under-sampling.
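The effect of a common mask can be illustrated with a small sketch. It assumes regular latitude-longitude grids, cosine-of-latitude area weights and two synthetic products with uniform local uptake; the function names and the 60° coverage limit are hypothetical and do not reproduce the ROS mask itself.

```python
import numpy as np

def cell_area_fraction(lat, nlon):
    """Approximate fraction of Earth's surface per cell of a regular global grid."""
    w = np.cos(np.deg2rad(np.asarray(lat, dtype=float)))[:, None] * np.ones((1, nlon))
    return w / w.sum()

def common_mask(fields):
    """True where every product has a finite value."""
    return np.all(np.isfinite(np.stack(fields)), axis=0)

def global_ohu(field, area_fraction, mask):
    """Heat uptake summed over `mask`, expressed per unit area of the whole Earth."""
    return float(np.sum(np.where(mask, field, 0.0) * area_fraction))

# Two synthetic products (hypothetical): 1 W m-2 local uptake, different coverage.
lat = np.arange(-89.5, 90.0, 1.0)
area = cell_area_fraction(lat, nlon=360)
full = np.ones((lat.size, 360))
argo_like = np.where(np.abs(lat)[:, None] <= 60.0, 1.0, np.nan) * np.ones((1, 360))

shared = common_mask([full, argo_like])
native = [global_ohu(f, area, np.isfinite(f)) for f in (full, argo_like)]
masked = [global_ohu(f, area, shared) for f in (full, argo_like)]
print("native coverage:", np.round(native, 3), " common mask:", np.round(masked, 3))
```

With the shared mask the two synthetic products agree by construction, but both fall below the value obtained with full coverage, which mirrors the trade-off between comparability and completeness described above.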

Fig. 8

Restrictive ocean sampling (ROS) mask experiment: a Percentage of Earth's surface covered by each gridded product (blue dots), after applying the satellite mask (± 66° latitude) (green dots) and after applying the ROS mask (orange dots). b Each product’s correlation of OHU time series with CERES net radiative flux using different masks, c same as (b) but for OHC trend or OHU, respectively. d Same as (b) and (c) but for the resulting OHU trend or OHC acceleration, respectively

Subsampling with the ROS mask, excluding shallow, coastal and marginal seas, also impacts the OHC acceleration or OHU trend, respectively, in W m−2 dec−1 (Fig. 8d), as well as the correlation between the OHU time series and CERES net radiative flux (Fig. 8b). The impact on the correlations is most evident for the spatially more complete satellite-based and reanalysis OHU series. For the satellite-based Legos product and ECCO reanalysis, the correlations decrease, potentially indicating that the more complete coverage enhances the correlation with CERES; however, the correlations for SODA3 and ORAS-5 increase when applying the ROS mask. Sampling according to the ROS mask increases the OHU trends for most products overall (except NCEI and ECCO), suggesting that a larger OHU trend is observed when the ocean is sampled by the Argo system alone, which could potentially be slightly skewed toward sampling regions of faster warming (Fig. 8d). Masking out the polar regions poleward of ± 66° latitude affects neither correlations nor OHU trends significantly in any of the data products, including the reanalysis products, suggesting that polar OHU changes might, currently, play less of a role for tracking global OHU change and the co-variability with CERES EBAF.

Spatial sampling considerations clearly impact the spread between products in our intercomparison of OHC products, the magnitude of “global” OHU estimates, the OHU correlations with CERES net radiative flux and OHU trends in different ways, which require further investigation. Applying the ROS mask reduces spread across products slightly, enabling a more consistent comparison, while at the same time reducing agreement with CERES variability and the OHU magnitude. This suggests more complete ocean volume coverage is instrumental in capturing the “true” global OHU magnitude and change.

3.2.2 Gaps in OHC and EEI Time Series

Here, we investigate the impact of hypothetical data gaps in OHC and EEI climate data records to better understand what implications observing gaps might have for trend analysis. Currently there is no plan for a follow-on Earth Radiation Budget mission post-Libera, which will be launched in 2027 on JPSS-4 and has a projected lifetime of 5 years (Harber et al. 2023; Hakuba et al. 2024). The in situ ocean observing system (e.g., Argo) is unlikely to suffer from sudden and complete interruptions that would create gaps in a global record, whereas satellite-based geodetic observations could, in principle, be interrupted in this way.

We quantify the impact of artificial gaps on OHU estimates and EEI trend magnitude and uncertainty, omitting the role of measurement uncertainty. The analysis does not consider (time-varying) calibration uncertainties (e.g., accounted for by Loeb et al. 2009b; Wielicki et al. 2013) nor absolute calibration shifts in the data record after the gap (Loeb et al. 2009b). This analysis therefore solely demonstrates a lower margin of statistical uncertainty introduced by gaps of different lengths (1–25 months) and gap location in the data record.

Our starting point is monthly anomalies of ocean heat content (Fig. 9a) from NOAA NCEI (Levitus et al. 2012). We introduce a gap of at least 1 month and at most 25 months long in the OHC record, varying the gap starting point between the beginning and end of the time series. For each gap ensemble member, we calculate the OHC trend (OHU) and trend uncertainty (95% confidence level, CI95) expressed in W m−2, which is 0.58 W m−2 ± 0.06 W m−2 for the gap-free record. Figure 9a shows the OHC time series including a 1-year gap (gray line). The red line indicates the linear regression performed to obtain the OHC trend or OHU, respectively, expressed in W m−2 with respect to the global surface area. Figure 9b scatters the trend bias in % for all gap-afflicted OHC series against the gap length and is colored by the gap starting year. With increasing gap length, the mean absolute bias (black line) as well as the maximum and minimum biases by gap length (gray envelope) increase, reaching a maximum bias at − 9% for a 2-year gap length placed at around 2020. The trend bias expressed in % of the original CI95 is shown in Fig. 9c. It appears that gaps placed toward the end of the record lead to negative biases, while largely positive biases occur with gaps at the beginning of the time series. This is probably related to the flatter increase in OHC at the beginning versus an accelerated increase toward the end of the OHC record. Similarly, but less linearly, the absolute bias in trend uncertainty (95% confidence interval) increases with gap length, reaching maxima near − 12% when placed near the middle of the record (Fig. 9d). Positive biases exceeding + 6% occur for gaps placed at the beginning of the record, a period of modest global OHC increase and variability.
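The gap experiment can be sketched with a few lines of code. The example below uses a synthetic, noise-free accelerating record rather than the NCEI data, and an ordinary least-squares fit with a white-noise confidence interval; both are assumptions made here, as are the function names.

```python
import numpy as np

def ols_trend(t, y):
    """Least-squares slope and an approximate 95% half-width (white-noise residuals assumed)."""
    slope, intercept = np.polyfit(t, y, 1)
    resid = y - (slope * t + intercept)
    se = np.sqrt(resid @ resid / (len(t) - 2) / np.sum((t - t.mean()) ** 2))
    return slope, 1.96 * se

def gap_trend_bias(t, y, start, length):
    """Percent change in the fitted trend when `length` samples from `start` are omitted."""
    keep = np.ones(len(t), dtype=bool)
    keep[start:start + length] = False
    full, _ = ols_trend(t, y)
    gappy, _ = ols_trend(t[keep], y[keep])
    return 100.0 * (gappy - full) / full

# Synthetic accelerating record, time in years (an assumption, not the NCEI series).
t = np.arange(216) / 12.0
y = 0.02 * t ** 2 + 0.5 * t

early = gap_trend_bias(t, y, start=6, length=24)
late = gap_trend_bias(t, y, start=len(t) - 30, length=24)
print(f"2-year gap early in record: {early:+.1f}%   late in record: {late:+.1f}%")
```

For an accelerating series, dropping early samples steepens the fitted line and dropping late samples flattens it, consistent with the sign pattern described for Fig. 9b.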

Fig. 9

a OHC monthly anomalies (NOAA NCEI), together with an example gap of 1-year length (black line) and the linear regression trend line (red). b OHC trend biases in percent of the original, gap-free trend magnitude (0.58 W m−2) as a function of gap length and gap starting year (color bar). The black line indicates the mean absolute trend bias, the gray line envelopes the maximum and minimum biases incurred by the gaps. c Same as (b) but for OHC trend biases in % of the original, gap-free trend uncertainty (95% confidence interval, CI95: ± 0.06 W m−2). d Same as (b) but for the trend uncertainty (CI95) bias in percent

Next, we examine trends in the monthly anomalies of global mean net radiative flux (CERES EBAF Ed.4.2). As above, we introduce a gap of at least 1 month and at most 25 months, varying the gap starting point between the beginning and end of the time series. For each gap length and starting time, we calculate the mean absolute bias, maximum and minimum biases in % of the gap-free trend (0.50 ± 0.15 W m−2 dec−1). We do the same for the trend uncertainty (CI95). Figure 10a shows the time series including a 6-month gap indicated by the gray line. The red line illustrates imputation of the missing values by linear interpolation as a form of uninformed gap filling. Figure 10b presents the trend biases by gap length with gaps omitted in the trend calculation (no fill, black line in Fig. 10a) and is colored by the gap location starting year. Omitting the gap appears to introduce, on average, a trend bias of up to 3% for gaps up to 25 months long; the maximum trend bias incurred is up to 20% for a 25-month gap toward the end of the data record in 2021 (Fig. 10b). Clearly, the longer the gap, the larger the trend bias incurred. This relationship is more pronounced when the gaps are imputed by linear interpolation rather than being omitted (Fig. 10e). The linear interpolation completely disregards the true natural variability during the gap and appears to bias the trend even more, on average by up to 8% for a 25-month gap and at maximum by 54% for a 20-month gap at the beginning of the record. Note that the internal variability (in terms of standard deviation, not shown) is slightly larger in the first half of the record compared to the second, which might explain the sensitivity to gaps at the beginning of the record. Non-informed gap filling can worsen the situation, and gaps of 2 months or longer require more sophisticated, data- and physics-informed imputation to not degrade the trend quality as much.
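The two gap treatments, omission and linear interpolation, can be contrasted in a short sketch. The series below is synthetic white noise on a weak trend, not CERES data, and the helper names are hypothetical.

```python
import numpy as np

def fill_gap_linear(t, y, start, length):
    """Replace `length` samples from `start` by linear interpolation across the gap."""
    gap = np.zeros(len(t), dtype=bool)
    gap[start:start + length] = True
    filled = np.array(y, dtype=float)
    # np.interp holds the nearest kept value constant if the gap touches an end of the record.
    filled[gap] = np.interp(t[gap], t[~gap], filled[~gap])
    return filled

def slope(t, y):
    """Least-squares linear trend."""
    return float(np.polyfit(t, y, 1)[0])

# Synthetic monthly anomalies (hypothetical): weak trend plus noise, time in decades.
rng = np.random.default_rng(1)
t = np.arange(240) / 120.0
y = 0.5 * t + 0.6 * rng.standard_normal(t.size)

start, length = 12, 20
keep = np.ones(t.size, dtype=bool)
keep[start:start + length] = False

full = slope(t, y)
bias_omit = 100.0 * (slope(t[keep], y[keep]) - full) / full
bias_fill = 100.0 * (slope(t, fill_gap_linear(t, y, start, length)) - full) / full
print(f"trend bias, gap omitted: {bias_omit:+.1f}%   gap interpolated: {bias_fill:+.1f}%")
```

Interpolated samples carry no variability of their own, so they enter the regression as if they were observations lying exactly on a straight segment between the gap edges.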

Fig. 10

a CERES EBAF EEI monthly anomalies, together with an example gap of 6-month length (black line) and a linear interpolation line across the gap (red). b EEI trend biases in percent of the original, gap-free trend magnitude (0.50 W m−2 dec−1) as a function of gap length and gap starting year (color bar). The black line indicates the mean absolute trend bias, the gray line envelopes the maximum and minimum biases incurred by the gaps. c Same as (b) but for EEI trend biases in % of the original, gap-free trend uncertainty (95% confidence interval, CI95: ± 0.15 W m−2 dec−1). d Same as (b) but for the trend uncertainty (CI95) bias in percent. e–g Same as (b), (c), (d) but for time series with gaps imputed using linear interpolation (e.g., red line in a)

Trend uncertainty (Fig. 10d, g) is sensitive to the gaps as well and shows a near-linear increase with gap length, with enhanced trend accuracy biases toward the end and beginning of the record. Likewise, uninformed gap imputation through linear interpolation increases bias in trend uncertainty even further.

The trend biases normalized by the trend uncertainty derived for the gap-free record (Fig. 10c) indicate that none of the trend biases is significant within trend uncertainty, but 2-year gaps placed toward the end of the record come close, inducing a trend bias of up to 70% of the trend uncertainty. As expected, for the imputed gaps of 15 months and longer, trend biases exceed trend uncertainty (Fig. 10f), implying a significant impact of data gaps with uninformed imputation.

Assuming a gap is incurred because of measurement discontinuity or a switch to a different measurement platform, calibration uncertainty and differing instrument characteristics will certainly impair the quality of trends without overlap or intercalibration capability (Loeb et al. 2009b), potentially making it impossible to tie time series together and derive meaningful trends. This analysis shows that a data gap in a record assumed to be free of measurement uncertainty and inhomogeneities can increase trend uncertainty significantly as a function of gap length and location alone. It also shows that gap imputation can worsen the impact of gaps if done improperly. The gap impacts are more pronounced when identifying trends in EEI from the CERES record, compared to the gap impacts on OHU from the OHC series. This is not surprising but indicates that the impact of gaps is also a function of signal-to-noise ratio; hence, it depends on how reliable the trend is to begin with. The rise in OHC is 20 times larger than the residual standard error, while the EEI trend exceeds the standard error by a factor of six only. The CERES team examined the feasibility of using less accurate imager retrievals to compute radiative fluxes and tie ERB time series before and after a data gap together (RBSP 2018). It was found that this “bridging” method yields uncertainty in net TOA flux that is too large to detect meaningful decade-to-decade changes in EEI. As of now, no comprehensive study exists on potentially suitable methods to bridge gaps and minimize their impact on satellite-derived EEI trends.

3.3 Zonal Trends in Net Radiative Flux

To identify regional patterns of change in net radiative flux (NET), we examine zonal distributions of 20+ year trends in NET, absorbed shortwave (SW) and absorbed longwave (LW) radiation. We compare observed global and area-weighted zonal trends in all-sky, clear-sky and cloud radiative effects (CRE), as well as in snow/ice cover and cloud properties. Studies that attribute observed radiative and EEI changes to geophysical processes, forcings and feedbacks require supplemental analysis of climate model experiments (e.g., Raghuraman et al. 2021), radiative kernel techniques (Kramer et al. 2021) and radiative perturbation analysis (Loeb et al. 2021, 2024).

In Fig. 11, we show area-weighted zonal trends (March 2000–June 2023, 1° latitude bands) in radiative flux anomalies normalized by the average zonal area. This way, the arithmetic average of area-weighted zonal trends equals the area-weighted global mean trend, enabling relative comparison of trends across latitudes and with the global mean. The radiation data as well as surface, atmospheric and cloud properties are taken from the most recent CERES EBAF Ed.4.2 and SYN Ed.4.1 data records. Figure 11a shows the area-weighted zonal trends for net radiative flux (NET) as well as the clear-sky (b) and cloud radiative effect (CRE, c) components. The all-sky NET trends are positive throughout all zones, implying the increase in EEI is evident across the globe. Zonal peak trends of 0.5 W m−2 dec−1 (global mean trend is 0.50 ± 0.15) or larger (orange dots), which are statistically significant at the 95% level (purple dots), occur in the deep tropics of the southern hemisphere (SH), tropics and subtropics of the northern hemisphere (NH), and at SH high latitudes poleward of 58° South. Figure 11b and c suggest that the global mean NET trend (green line) is almost solely established through a positive NET clear-sky trend (0.48 ± 0.12), complemented by a small positive NET CRE trend (0.02 ± 0.13), and largely associated with change in absorbed clear-sky SW radiation (0.35 ± 0.10; Fig. 11e). Near both poles and the NH mid-latitudes, the zonal distribution of clear-sky SW trends appears to primarily correspond to decreases in snow and ice cover (Fig. 11k). Negative changes in aerosol optical depth (AOD, Fig. 11j) appear to add to the positive clear-sky SW trends between 15° and 40° North. Positive clear-sky LW trends are most pronounced in the tropics and shape the NET clear-sky changes in this region, aligning with the expected and regionally enhanced “super greenhouse effect” (Stephens et al. 2016; Raghuraman et al. 2019). 
Positive clear-sky LW trends south of 30° N and negative trends north of it appear to largely cancel in the global mean, resulting in a global clear-sky absorbed LW trend of 0.12 ± 0.10 W m−2 dec−1.
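The normalization by average zonal area can be written out explicitly. The sketch below assumes cosine-of-latitude weights for 1° bands and uses synthetic zonal trends; the function name is hypothetical.

```python
import numpy as np

def area_weighted_zonal(zonal_values, lat):
    """Scale each zonal value by its band area relative to the average band area."""
    w = np.cos(np.deg2rad(np.asarray(lat, dtype=float)))
    return np.asarray(zonal_values, dtype=float) * w / w.mean()

lat = np.arange(-89.5, 90.0, 1.0)                        # centers of 1-degree latitude bands
rng = np.random.default_rng(2)
zonal_trend = 0.5 + 0.3 * rng.standard_normal(lat.size)  # synthetic trends, W m-2 per decade

weighted = area_weighted_zonal(zonal_trend, lat)
global_mean = np.average(zonal_trend, weights=np.cos(np.deg2rad(lat)))
print(f"mean of weighted zonal trends = {weighted.mean():.4f}, global mean = {global_mean:.4f}")
```

Because each weight is divided by the mean weight, the plain average of the scaled values reproduces the area-weighted global mean, while high-latitude bands are visibly damped relative to their unweighted trends.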

Fig. 11

Area-weighted zonal mean decadal trends in net radiative flux (NET, a), clear-sky NET (b) and NET cloud radiative effect (CRE, c). Decadal trends in shortwave absorbed flux (SW, d), clear-sky SW (e) and SW CRE (f). The same for longwave absorbed flux (LW, g), clear-sky LW (h) and LW CRE (i). Lastly, decadal trends in AOD (j), snow and ice cover (k), and cloud fraction (l). The purple dots indicate zonal mean trends that are significant at the 95% confidence level; the orange dots indicate absolute trends that are equal to or exceed the absolute global mean trend indicated by the green line. The gray shading indicates the zonal trends’ 95% confidence interval. Positive trends in a-i indicate increase in radiative absorption, i.e., contributing to positive net radiative trend

The near-zero global NET CRE trend originates from regional cancelations (Fig. 11c) with significant negative trends poleward of 60° North and South, and in the NH tropics that partially, but incompletely, compensate for positive clear-sky trends in these regions, most evidently in the NH polar region. The deep tropical “dip” in CRE trends could be associated with an observed narrowing and intensification of the tropical convection zone (e.g., Wodzicki and Rapp 2016, 2022; Su et al. 2017; Byrne et al. 2018). Significant positive NET CRE trends primarily occur in the NH mid-latitudes. The zonal distribution of NET CRE trends correlates more strongly with SW CRE trends (R = 0.93) than with LW CRE trends (R = − 0.69). On the global scale and by latitude, SW CRE (0.40 ± 0.13 W m−2 dec−1) and LW CRE (− 0.38 ± 0.06 W m−2 dec−1) trends nearly cancel each other, with the overall positive SW CRE dominating the NET CRE changes. Zonal trends in cloud fraction (global trend: − 0.20 ± 0.10% dec−1, Fig. 11l) largely align with the zonal changes in NET CRE (R = − 0.36), but the zonal trends in cloud optical depth (COD, global trend: 0.02 ± 0.02 dec−1, Fig. 12a) exhibit a higher correlation with zonal NET CRE trends (− 0.63), underscoring the relevance of cloud microphysical change in potentially driving the net radiative changes observed (see also Stephens et al. 2024, this issue). To further investigate the SW and LW CRE cancelation effects in the tropics, we contrast the positive NET CRE trends in the SH tropics (5°–20° South) with the negative CRE changes in the NH tropics (0°–15° North). Figure 12c–h shows three scatter plots per region, each exploring the relationship between LW and SW CRE regional trends, with the dots (regional trends) colored by cloud fraction (CF, %), cloud optical depth (COD) or cloud top pressure (CTP) trends. 
It is evident for both tropical regions, even though the resulting zonal NET CRE trends are opposite, that the relationship between regional LW and SW CRE trends is very similar, with the CRE trends being negatively correlated (R = − 0.8) and dominated by SW CRE effects, as indicated by a regression slope that is negative and smaller than 1 in magnitude. This means, on average, positive SW CRE trends tend to overcompensate for negative LW CRE trends, and negative SW CRE trends largely outweigh positive LW CRE trends. In both regions, positive SW CRE trends that outweigh negative LW CRE trends (orange triangle) are largely associated with a decrease in cloud cover, as well as decreased cloud thickness and height. Likewise, the negative SW CRE trends overcompensating for positive LW CRE trends (green triangle) are associated with an increase in cloud cover and clouds getting both thicker and higher. The latter appears to be the case for the deep tropical dip region which coincides with the mean location of the ITCZ (~ 12° North, e.g., Wodzicki and Rapp 2016). In the NH mid- and high latitudes, the inverse relationship between LW and SW CRE trends is not as pronounced (not shown). The overall positive NET CRE trends in the mid-latitudes are dominated by positive SW CRE trends associated with a decrease in cloud cover and cloud thickness. Likewise, negative NET CRE trends at high latitudes are mostly associated with negative SW CRE trends (not shown). The tropical cases suggest that the thinning and thickening of clouds play an important role in modulating SW and NET CRE changes, which is further investigated for the tropics by Stephens et al. (2024). Zonal and regional NET CRE trends show no clear relationship with changes in cloud height (R = 0.02), but more so with cloud cover (R = − 0.36) and mostly with cloud thickness trends (R = − 0.63), further underpinning the role of SW CRE and the optical properties of clouds. 
It is furthermore interesting to note that the zonal distributions of trends in COD and AOD are very much alike (R = 0.8), which may suggest that indirect aerosol effects are shaping the changes in COD (e.g., Oreopoulos et al. 2020) and impact the zonal distribution of SW and NET CRE trends. While our trend analysis suggests SW clear-sky trends may be an important contribution to the EEI changes observed, radiative perturbation studies and model analysis find that significant positive contributions by cloud changes are of at least equal importance (Loeb et al. 2021; Raghuraman et al. 2021). Furthermore, Stephens et al. (2022) demonstrate that high-latitude surface changes play a subordinate role in modulating Earth’s reflectivity. We recognize the need for radiative perturbation analysis to adequately quantify the relative role of clear-sky and CRE processes driving change in zonal radiative flux. Assessing and intercomparing zonal and regional changes, not only in TOA radiation, but also OHU, is a key recommendation of GEWEX-EEI (Sect. 4).
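The argument that a regression slope between −1 and 0 implies SW-dominated NET CRE trends can be checked with a small sketch. It assumes that LW CRE trends are regressed on SW CRE trends (LW on the vertical axis) and that NET CRE is the sum of the two; the data are synthetic and the function name is hypothetical.

```python
import numpy as np

def cre_compensation(sw_trend, lw_trend):
    """Regression slope of LW on SW CRE trends, their correlation, and the
    fraction of points where the NET (SW + LW) trend has the sign of the SW trend."""
    sw = np.asarray(sw_trend, dtype=float)
    lw = np.asarray(lw_trend, dtype=float)
    slope = float(np.polyfit(sw, lw, 1)[0])
    r = float(np.corrcoef(sw, lw)[0, 1])
    same_sign = float(np.mean(np.sign(sw + lw) == np.sign(sw)))
    return slope, r, same_sign

# Synthetic regional trends (not observations): LW opposes SW but only partially.
rng = np.random.default_rng(4)
sw = rng.standard_normal(300)
lw = -0.6 * sw + 0.3 * rng.standard_normal(300)

slope, r, same_sign = cre_compensation(sw, lw)
print(f"slope = {slope:.2f}, R = {r:.2f}, NET follows SW sign in {100 * same_sign:.0f}% of regions")
```

If LW ≈ b·SW with −1 < b < 0, then NET ≈ (1 + b)·SW keeps the sign of the SW trend, which is the sense in which the SW component dominates.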

Fig. 12

a and b Same as in Fig. 11, but for cloud optical depth (a) and cloud top height (b). c–e Scatter plots of LW versus SW CRE regional 20+ year trends in the tropical regions between 0° and 15° North, colored by trend in cloud fraction (c), cloud optical depth (d) and cloud top height (e). f–h Same as c–e but for tropical regions between 5° and 20° South

4 Discussion, Conclusions and Recommendations

Under the umbrella of the GEWEX-EEI assessment, we intercompare 21 OHC time series and OHU estimates (2005–2020) from various sources and institutions, as well as OHU trends and OHU correlations with CERES net radiative flux variability. The goals of this effort are a better understanding of discrepancies and their sources, and to forge a path toward best practices that may enable apples-to-apples comparison across methods, which ultimately improves our estimate of EEI and its uncertainties, a key indicator of global climate change.

Our study shows a significant spread in central OHU estimates (normalized to Earth’s TOA surface area) ranging from 0.51 to 0.74 W m−2 for in situ-based estimates, generally larger values from satellite-based OHC (0.89 W m−2), and 0.40–0.96 W m−2 across six ocean reanalyses. The in situ-based OHC products do not capture the deep ocean below 2000 m, and an estimate of 0.06 ± 0.04 W m−2 (Johnson et al. 2023b) for deep OHU has been added to be representative of the full ocean column. Using a subset of gridded OHC products, we demonstrate substantial influence of sampling considerations, with Argo-like sampling generally reducing OHU estimates from satellite and reanalysis data. Assuming satellite sampling alone (± 66° latitude) reduces OHU estimates from reanalysis data only slightly. It is likely that more complete coverage as, for example, provided by satellite observations is beneficial in representing near-global OHU, but neither satellite dataset achieves full global coverage, missing OHU in the polar regions. Fully global data collection would require spacecraft in polar orbit, as well as enhanced observing systems to more densely sample OHC below sea ice across the full seasonal cycle. Ocean models and reanalysis systems do sample all of the ocean, but are reliant on observational coverage as well, yielding larger uncertainty where direct observations are lacking (e.g., below 2000 m depth and in under-sampled regions; Storto et al. 2019). Clearly, the ocean coverage, both geographically and vertically, plays a substantial role in achieving apples-to-apples comparison, suggesting use of a common mask. However, in terms of portraying the global OHU adequately and its variability with respect to CERES data, it is more complete coverage that should be strived for.

Temporal variability in annual mean OHU at both 12- and 6-month increments is compared against CERES EBAF net radiative flux variability. It stands out that both satellite-based, two reanalyses and the in situ + satellite hybrid products, RFROM and PMEL-combined, exhibit correlations with CERES EBAF net radiative flux of 0.44 or larger, while most of the largely in situ-based OHU series do not agree as well. This may suggest that enhanced spatiotemporal sampling and physically informed regional filling is important for matching the interannual variability and trend found in CERES data.

For the in situ-based OHU fields, quality control choices for the observed data are a critical factor not only in reducing measurement error, but in reducing the representation error, the difference between a point source measurement and the wider spatial area represented in the gridded fields used for the OHU calculations. This also applies to reanalysis products, which rely on external (observations processing centers) and internal quality control procedures (see, e.g., Storto et al. 2016). The mapping method used to calculate uniform gridded fields from heterogeneous measurements in time and space is also crucial to addressing representation error. The representation error for temperature in the ocean is larger than the measurement error (Oke and Sakov 2008). The International Quality-controlled Ocean Database (IQuOD) project brings together the international community working on data rescue, quality control, bias correction and uncertainty quantification for ocean temperature data. The Argo program uniformly monitors measurements from deployed floats, assigning measurement uncertainties and quality flags (Wong et al. 2020). The outcomes of these community-wide efforts will result in a uniformly quality-controlled ocean temperature dataset homogenizing the measurement errors across different in situ-based OHU calculations.

The MapEval4OceanHeat project, an objective assessment of mapping methods used to estimate ocean heat content change, is aiming to improve our understanding of different interpolation/mapping techniques (including the uncertainty estimates some provide), by systematically applying different methods to the very same set of synthetic input data (sampled from a high-resolution model) and comparing the output maps with the actual model fields. While the project is ongoing, a protocol has been released (Giglio et al. 2023) describing which experiments are being conducted.

Deriving OHU from OHC data, and the resulting correlation with CERES, is impacted by choices such as the differencing method (e.g., first versus centered differences) and temporal sampling/smoothing (monthly vs. annual vs. low-pass-filtered). For example, most annual mean OHU time series show better agreement with CERES net radiative flux at 6-month intervals than at 12-month intervals. A smoothing filter as applied by Marti et al. (2022) improves the correlation of the Legos product with CERES, by suppressing interannual variations in atmospheric heat storage and energy divergence at time resolutions smaller than 2–3 years (Palmer and McNeall 2014). The EEI community has recognized that even though global long-term mean atmospheric heat storage is comparably small, it exhibits significant interannual variability and, in combination with OHU, improves the co-variability with CERES EEI substantially at seasonal timescales (Johnson et al. 2023a, this issue).

A major concern for satellite-derived OHC products is the adequate knowledge of seawater’s expansion efficiency of heat, which acts as a scaling factor for OHU variations and significantly affects the magnitude of internal variability and trend in OHU. Hakuba et al. (2021) derived an inverse efficiency of 0.52 ± 0.065 W m−2 (mm year−1)−1, which is significantly larger than the 0.43 W m−2 (mm year−1)−1 derived by Marti et al. (2022). We recommend reassessing the magnitude, variability and uncertainty of global, regional and ocean profile efficiencies, also with a view to producing adequate satellite-based regional OHC estimates that would significantly add to our understanding of regional discrepancies and variations in OHC and OHU. For example, Hakuba et al. (2021) found the largest basin-wide discrepancy in steric sea level compared to in situ data in the Indo-Pacific, which has yet to be explained and assessed in terms of OHU. Regional assessments would furthermore help to address a sudden dip in OHU near the year 2016, which is evident in many of the in situ-based OHU time series and has yet to be understood.
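The sensitivity to this scaling factor can be made concrete with a short worked example. The sketch below assumes that the two quoted values are coefficients in W m−2 per mm year−1 of global mean thermosteric sea level rise, and it uses a hypothetical thermosteric rate; neither the rate nor the function name comes from the studies cited.

```python
# Inverse expansion efficiencies quoted in the text, interpreted as
# W m-2 per (mm/yr) of thermosteric sea level rise (assumed units).
COEFF_HAKUBA_2021 = 0.52
COEFF_MARTI_2022 = 0.43


def ohu_from_thermosteric_rate(rate_mm_per_year, coefficient):
    """Scale a thermosteric sea level rate (mm/yr) to OHU (W m-2)."""
    return rate_mm_per_year * coefficient


rate = 1.5  # hypothetical thermosteric rate in mm/yr, for illustration only
ohu_a = ohu_from_thermosteric_rate(rate, COEFF_HAKUBA_2021)
ohu_b = ohu_from_thermosteric_rate(rate, COEFF_MARTI_2022)

# The relative difference depends only on the two coefficients, not on the rate.
relative_difference = ohu_a / ohu_b - 1.0
print(f"{ohu_a:.3f} vs {ohu_b:.3f} W m-2 "
      f"({100 * relative_difference:.0f}% larger)")
```

Because the conversion is a pure multiplication, the same thermosteric signal yields an OHU estimate about 21% larger with the first coefficient than with the second, and this applies equally to the trend and to the amplitude of the variability.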

There is substantial spread not only in OHU central estimates, but also in the OHU error bars (Fig. 4). For example, uncertainties derived for geodetic OHU exceed the error bars derived for in situ-based OHU. While the geodetic community has partially estimated and combined major uncertainties in altimetry and gravimetry observations (e.g., Blazquez et al. 2018; Ablain et al. 2019), the uncertainties for in situ OHC are often not fully quantified or combined. For example, uncertainties due to instrument bias correction or sampling considerations would ideally flow into comprehensive OHU uncertainty estimates. Like many before them, Meyssignac et al. (2019) determined OHC trend uncertainties from trend residuals alone, accounting for lag-1 autocorrelation, but state that the trend uncertainties ought to be at least doubled to account for sampling uncertainty. We therefore recommend establishing a roadmap toward comprehensive and consistent uncertainty estimation for OHC, OHU and EEI trend estimates that accounts for all known sources of error and their covariances, across all approaches discussed: in situ, reanalysis, satellite and hybrid.
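As an illustration of a residual-based trend uncertainty with a lag-1 autocorrelation adjustment, the sketch below uses a common effective-sample-size correction. It is a minimal example under stated assumptions, not necessarily the exact procedure of Meyssignac et al. (2019); the function name and the synthetic AR(1) series are hypothetical.

```python
import numpy as np


def trend_with_ar1_uncertainty(t, y):
    """Least-squares trend and its standard error, adjusted for lag-1
    autocorrelation of the residuals via an effective sample size."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    n = t.size
    slope, intercept = np.polyfit(t, y, 1)
    resid = y - (slope * t + intercept)

    # Lag-1 autocorrelation of the residuals; negative values are clipped
    # so the adjustment never shrinks the uncertainty (a conservative choice).
    r1 = max(np.corrcoef(resid[:-1], resid[1:])[0, 1], 0.0)
    n_eff = n * (1.0 - r1) / (1.0 + r1)
    if n_eff <= 2:
        raise ValueError("effective sample size too small for a trend fit")

    resid_var = np.sum(resid**2) / (n_eff - 2.0)
    slope_se = np.sqrt(resid_var / np.sum((t - t.mean()) ** 2))
    return slope, slope_se, r1, n_eff


# Synthetic 20-year monthly series: linear trend plus AR(1) noise.
rng = np.random.default_rng(0)
n = 240
t = np.arange(n) / 12.0
noise = np.zeros(n)
for i in range(1, n):
    noise[i] = 0.7 * noise[i - 1] + rng.normal(0.0, 1.0)
y = 0.5 * t + noise

slope, slope_se, r1, n_eff = trend_with_ar1_uncertainty(t, y)
```

An estimate of this kind reflects only the scatter about the fitted line. Sampling, mapping and instrument bias contributions are not included, which is why the text above notes that such uncertainties ought to be inflated.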

Estimating EEI via heat inventory analysis is to date the most viable approach to closing the Earth’s energy budget and is made possible largely by the unprecedented coverage of the ocean by Argo floats. It is therefore critical to maintain and expand the ocean observing system to improve coverage geographically and vertically into the deepest and notoriously under-sampled layers of the ocean, which Deep Argo regional pilot arrays have demonstrated is feasible (e.g., Jayne et al. 2017), although funding is not yet identified for a global expansion. Additional avenues to monitor EEI independently and directly from space at the TOA are under investigation by several groups around the world (e.g., Schifano et al. 2022; Hakuba et al. 2019), but the feasibility of achieving the unprecedented accuracy required needs further study. Such missions would provide an additional, independent constraint on EEI magnitude and its change from year to year.

The study of zonal trends in net radiation with CERES data has revealed several key regions of change across the tropics, subtropics and at high latitudes. The analysis suggests a primary contribution to positive EEI trends from absorbed SW radiation, in line with previous studies (e.g., Loeb et al. 2021; Stephens et al. 2022). Although SW accumulation is a leading factor, climate model, kernel and radiative perturbation analyses consistently indicate that LW forcing, although masked by competing climate responses, is the fundamental trigger of the positive and negative climate responses and feedbacks (Raghuraman et al. 2021; Kramer et al. 2021). These responses mute LW absorption on the one hand and amplify energy uptake driven by positive SW changes on the other. Continued investigation of EEI change and its causes is needed to resolve any disparate conclusions on potential drivers of EEI change and variability (e.g., Stephens et al. 2022; Loeb et al. 2021, 2024; Raghuraman et al. 2021; Kramer et al. 2021).

The gap impact analysis performed here ignores the role of measurement uncertainties and potential shifts between non-overlapping parts of a data record, and requires further investigation that includes observing system characteristics. The analysis reveals that gaps of any length (between 1 and 25 months) can have a significant impact on deduced trend magnitude and uncertainty, depending on the location of the gap in the data record. The impact is larger for EEI than for OHC trends, given the more linear and robust increase in OHC, while EEI trends are more sensitive to the period considered and to the interannual variability that substantially shapes the > 20-year EEI record. Because intercalibration and bridging issues are ignored, the gap impacts on trend reported here are at the lower end of what is to be expected; in practice, such issues would likely make trend detection infeasible for years to come (e.g., Loeb et al. 2009a). It is therefore of utmost importance to prevent data gaps, investigate “bridging” methods and ensure seamless monitoring of EEI change well into the future. Studies that explore observing system requirements, ways to meet them, critical ancillary information and the role of data gaps are of great relevance to maintaining and improving the EEI monitoring system and research framework (KISS Continuity Study Team 2024).
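The dependence on gap location can be sketched with a simple experiment: remove a block of consecutive months at every possible position and compare the refitted linear trend with the trend of the complete record. The example below is a minimal illustration under the same simplifications noted above (no measurement uncertainty, no offset across the gap); the function name and the synthetic series are assumptions and do not reproduce the analysis in this paper.

```python
import numpy as np


def gap_trend_impact(t, y, gap_length):
    """Change in the least-squares trend caused by removing `gap_length`
    consecutive samples, for every possible gap start position."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    full_trend = np.polyfit(t, y, 1)[0]
    impacts = []
    for start in range(t.size - gap_length + 1):
        keep = np.ones(t.size, dtype=bool)
        keep[start:start + gap_length] = False  # mask out the gap
        gapped_trend = np.polyfit(t[keep], y[keep], 1)[0]
        impacts.append(gapped_trend - full_trend)
    return np.array(impacts)


# Synthetic 20-year monthly "EEI-like" series: weak trend plus interannual
# variability (amplitude and period are hypothetical).
t = np.arange(240) / 12.0
y = 0.05 * t + 0.5 * np.sin(2 * np.pi * t / 3.7)

impacts_12 = gap_trend_impact(t, y, gap_length=12)
worst_start = int(np.argmax(np.abs(impacts_12)))  # gap position with largest impact
```

For a perfectly linear record the impact is zero wherever the gap falls, so the spread in `impacts_12` arises from the interannual variability, consistent with the larger sensitivity of EEI trends relative to OHC trends described above.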

The four approaches for estimating OHC and OHU, together with the radiometric observations of TOA radiative flux intercompared here, represent the core of today’s EEI monitoring system. Of these, in situ OHC has been the most vital for constraining EEI magnitude and, together with reanalysis, provides unparalleled insight into the distribution of heat across the ocean volume. Geodetic satellite observations provide enhanced spatiotemporal coverage for estimating full-column OHC and confirm the positive EEI trend derived from TOA radiation measurements. Scanning radiometry not only provides insight into EEI variability at high precision and stability but also allows us to study the radiative processes that perturb EEI, e.g., the role of cloud and aerosol radiative effects.

The first GEWEX-EEI Assessment Workshop, held in spring 2023, yielded recommendations (Meyssignac et al. 2023b) that in part have been touched upon in this paper and are summarized as follows:

  1. Discrepancies among EEI and OHU products, methods and their origin, ought to be systematically assessed and improved upon.

  2. Regional, zonal and basin-scale intercomparisons are recommended to better understand global discrepancies and the impact of differing ocean volumes sampled. With respect to regional geodetic OHC analysis, in-depth assessment of expansion efficiency is required.

  3. Best practices to enable apples-to-apples comparison (e.g., sampling considerations, uncertainty quantification, OHU derivation) ought to be established and shared with the community.

  4. Beyond improving our knowledge of EEI with existing observations, ensuring seamless continuity of these systems and data products should be a priority, as well as efforts to expand those for improved coverage of the ocean, land and cryosphere (see also von Schuckmann et al. 2023).

  5. Novel techniques ought to be explored to provide independent and direct measurements of EEI at the TOA.

  6. Understanding EEI changes and their attribution is as important as the comprehensive quantification and characterization of EEI and its variability.

The next steps for the GEWEX-EEI assessment involve the intercomparison of OHC and OHU trends at the regional scale and the establishment of comprehensive and consistent uncertainty quantification for at least the most dominant error sources. While all methods intercompared agree on positive OHU values between 0.40 and 0.96 W m−2, higher confidence and temporal precision are needed to track changes from year to year with confidence, especially if climate mitigation strategies and their impact are to be monitored and understood.