1 Introduction

Extreme weather events, such as heavy precipitation, hail storms, heat waves, droughts and damaging wind gusts, have a strong and detrimental impact on East African societies. The Lake Victoria Basin (LVB) and its surroundings are especially vulnerable to extreme events, as demonstrated by devastating flash floods caused by intense rainfall in the mountainous region (Jacobs et al. 2016; Wainwright et al. 2021). The population around Lake Victoria is affected by extreme weather both on land, where flooding regularly occurs (Lwasa 2010), and on the lake, where nightly storms often catch fishermen by surprise (Semazzi 2011). This situation is expected to worsen in future decades, both due to the growing population (Vermeiren et al. 2012) and the projected intensification of extreme precipitation over the lake (Thiery et al. 2016; Finney et al. 2020a; Van de Walle et al. 2021), thus underlining the risks associated with climate change in the LVB. It is therefore of key importance to investigate how extreme weather events will evolve in future decades in the LVB and to provide improved information to the impact community. This is the overall aim of ELVIC (climate Extremes in the Lake VICtoria basin), carried out under the framework of the World Climate Research Program’s (WCRP) Coordinated Regional Downscaling Experiments Flagship Pilot Studies program (CORDEX-FPS).

Climate modeling in the LVB and surroundings is especially challenging due to the complex orography and the presence of larger and smaller lakes with irregular coastlines. These lakes strongly influence the mesoscale atmospheric dynamics due to the thermal contrast between the inert lake and the quickly responding land, facilitating convection over the lake during nighttime and inhibiting it during daytime (Anyah et al. 2006; Thiery et al. 2015; Woodhams et al. 2019). Storms occurring on land affect humidity, temperature and atmospheric dynamics in the boundary layer, and are therefore forerunners for strong nightly storms over the lake (Thiery et al. 2016, 2017; Woodhams et al. 2018). Moreover, the mountains around Lake Victoria exert a strong control on precipitation characteristics by inducing anabatic and katabatic winds (Anyah et al. 2006). On the larger scale, the region is characterized by lower tropospheric easterlies which transport moisture to the LVB (Anyah et al. 2006). The mountains on the eastern side of the basin strongly modulate these large-scale winds, with distinctly different precipitation patterns in the basin during flow-over conditions compared to blocked conditions (Van de Walle et al. 2020). Similarly, westerly anomalies can bring moisture from the Congo Basin over the mountains to the west of the lake, enhancing LVB rainfall (Finney et al. 2020b). A realistic representation of the lake and mountains is therefore of key importance for correctly modelling the mesoscale circulation, the associated convection patterns and the amount and timing of precipitation in the LVB.

The complex orography and the lakes have a substantial impact on projected future East African precipitation due to local and mesoscale feedbacks (Souverijns et al. 2016). Changing mesoscale dynamics associated with a larger thermal inertia of the lake are expected to affect lake precipitation (Thiery et al. 2016; Finney et al. 2020a). Changes in large-scale atmospheric dynamics also play a role when synoptic systems enhance moisture convergence over equatorial East Africa (Cook et al. 2020). A future increase in extremes over the lake has been mainly attributed to a higher moisture content (Thiery et al. 2016; Finney et al. 2020a). In addition, the projected weakening of mesoscale circulation is compensated for by stronger thunderstorm dynamics (Van de Walle et al. 2021), which is widespread over tropical Africa under climate change (Jackson et al. 2020). The importance of an accurate representation of mesoscale circulation and deep convection for future climate projections is thus evident. An increase in model resolution to the kilometer scale is an obvious approach towards these aims.

When decreasing horizontal grid spacing to about 4 km or less, deep convection is crudely resolved by the model and the parametrization of this process can be switched off (Prein et al. 2015). Recent Convection-Permitting (CP) model simulations over East Africa have been shown to substantially improve model performance for rainfall intensity, rainfall diurnal cycle, storm propagation and dry spells (Van de Walle et al. 2020; Finney et al. 2019, 2020a; Jackson et al. 2020; Senior et al. 2021; Misiani et al. 2020; Woodhams et al. 2018). Moreover, Van de Walle et al. (2020) found a much better representation of top-of-atmosphere radiation in their COSMO-CLM CP model simulation compared to the CORDEX-Africa COSMO-CLM simulations (Kothe et al. 2014) and linked this both to higher resolution and to the improved microphysics (2-moment scheme). In addition, improvements in continental-scale circulation and regional rainfall accumulations, such as a reduction in the persistent dry bias in West Africa, are found in a CP pan-African UKMO simulation (Stratton et al. 2018; Senior et al. 2021). The same model demonstrates an effect of the CP scale on the intra-annual precipitation cycle in East Africa, both in terms of biases and in terms of the climate change signal (Wainwright et al. 2021). CP models open avenues to a level of understanding of meteorological processes that was not possible before, e.g., for lightning formation that relies on an accurate representation of convection and cloud ice (Finney et al. 2020c), to investigate the climate impact of recent urban expansion (Brousse et al. 2020), to improve our understanding of regional wind dynamics (Docquier et al. 2016), or to explore the meteorological drivers of the flow patterns in East African lakes by coupling to a 3D hydrodynamic model (Kranenburg et al. 2020).

Although individual studies demonstrate improvements in CP models for East Africa, this has never been systematically tested in a coordinated regional climate modeling framework. Recent multi-model ensemble studies with CP models emphasize that reliance on individual models or simulations may produce misleading results (Ban et al. 2021), that an ensemble-based approach provides added value for investigating high-impact convective processes (Coppola et al. 2020) and that it can help to improve climate services when coupled to local impact models (Termonia et al. 2018; Senior et al. 2021). Therefore, this paper assesses the added value of CP models for the representation of deep convective systems and precipitation in Equatorial Africa. For this purpose, 10-year present-day model simulations were carried out with five regional climate models at both parameterized and convection-permitting scales, namely COSMO-CLM, RegCM, ALADIN/AROME, WRF and UKMO. This model evaluation is an important first step toward making convection-permitting model simulations a routine component of climate projections for East Africa, as advocated by Senior et al. (2021).

2 Methods

2.1 Overview of the study area

The Lake Victoria Basin (LVB), which is the geographic area draining into the lake, is located between two arms of the East African Great Rift Valley system in a shallow continental sag (Anyah et al. 2006) (Fig. 1). The basin extends over six countries, namely Burundi, the Democratic Republic of the Congo, Rwanda, Kenya, Tanzania and Uganda. The latter three countries share Lake Victoria, the largest lake in Africa (68,800 km\(^2\)) and the main reservoir of the Nile river (Herschy 2012). The lake is surrounded by some of the highest mountains in Africa in the western and eastern rifts. The complex orography in combination with the lake underlines the need for high-resolution modelling.

Fig. 1 Size of the evaluation domain, the orography (colors) and the contours of the Lake Victoria basin (thick black line)

2.2 General model setup

A protocol was established to harmonize the model integrations: a 10-year period (2006–2015) was chosen, plus the year 2005 as spin-up. The lateral boundary conditions are from ERA-Interim (Dee et al. 2011) since not all institutes had the possibility to use ERA5. It was recommended to have a (1-dimensional) lake model integrated in the atmospheric model. All institutes performed integrations at the 10 kilometer scale using PARameterized (PAR) convection as well as at the kilometer scale or Convection-Permitting (CP) scale.

The protocol proposes a double nesting approach in which the output of the 10 kilometer scale domain is used to force the CP model. The PAR model domain is about 2000x2500 km\(^2\) large and covers the six countries mentioned above. The CP domain is about 850x1250 km\(^2\) large. It is centered over Lake Victoria and extends well beyond the basin. The evaluation domain, shown in Fig. 1, is 80 km smaller on each side than the modelling domain specified in the protocol. Even though this evaluation domain is larger than the basin, for simplicity we refer to this area as the Lake Victoria Basin (LVB) throughout the manuscript.

Although we intended to streamline the model integrations with this protocol, in practice the models differ in their setup, in the time period for which the simulations were performed and in the lateral boundary forcing. Some of the model integrations were used within other projects for which different requirements were set, which were sometimes incompatible with the ELVIC protocol. Notably, the UKMO integrations, which were performed before the protocol was designed, cover the entire African continent. Moreover, COSMO-CLM was forced with ERA5 instead of ERA-Interim. Since limited resources were available for this computationally expensive research for Africa, we chose to follow an inclusive approach and allow for deviations from the protocol.

The CP and PAR models differ not only in their resolution, but also in terms of the parameterisations that are used. The Weather Research and Forecast model (WRF; Skamarock et al. 2008) applies a grid spacing of 3.1 km in the CP (referred to as WRF-3.1) and 12 km in PAR (WRF-12). Besides horizontal resolution, the coarse and fine resolution model setups differ in vertical resolution (35 and 69 vertical layers, respectively). Deep convection parameterization is switched off in the CP run. The remaining physical parametrizations are identical in PAR and CP.

ALADIN and AROME are part of the HARMONIE-Climate (HCLIM38) regional climate modelling system designed for a range of horizontal resolutions from tens of kilometres to convection-permitting scales of 1–3 km (Lindstedt et al. 2015). The PAR model (HCLIM38-ALADIN referred to as ALADIN-12) uses a grid spacing of 12.5 km while the CP model (HCLIM38-AROME referred to as AROME-2.5) uses a grid spacing of 2.5 km. The CP simulation uses a non-hydrostatic dynamical core (Benard et al. 2010), while the PAR simulation has a hydrostatic core (Temperton et al. 2001). The CP simulation does not employ a deep convection parameterization, while the shallow convection is parameterized following De Rooy and Siebesma (2008). The PAR simulation uses both deep (Bougeault 1985) and shallow convection parameterizations (Bazile et al. 2011). The turbulence parameterizations are somewhat different, with the CP simulation following Lenderink and Holtslag (2004) while the PAR uses Cuxart et al. (2000). Finally, the orographic wave drag is used only in the PAR simulations and follows Catry et al. (2008).

The United Kingdom Met Office (UKMO) Unified Model (Clark et al. 2016) CP simulation (UKMO-4.5) has a 4.5 km horizontal grid at the equator with 80 vertical levels, while the PAR simulation (UKMO-25) has a 25 km grid with 63 vertical levels. The remaining differences in model set-up are primarily because some assumptions are not valid at km-scales: UKMO-25 has a one-dimensional boundary layer scheme (Lock 2001) while UKMO-4.5, also referred to as CP4A, has a three-dimensional boundary-layer scheme (Boutle et al. 2014). Moreover, a prognostic cloud fraction and condensation scheme (Wilson et al. 2008) is used in UKMO-25 whereas UKMO-4.5 uses the diagnostic Smith (1990) scheme. A moisture conservation scheme (Aranami et al. 2015) is implemented in UKMO-4.5 and not in UKMO-25. It reduces unrealistically strong precipitation intensities caused by transport errors from the semi-Lagrangian scheme, and it is applied only in UKMO-4.5 partly because these errors are expected to be larger in CP models. More details can be found in Stratton et al. (2018).

For COSMO-CLM version 5 (Rockel et al. 2008), apart from the difference in horizontal resolution (12 km for COSMO-CLM-12 and 2.8 km for COSMO-CLM-2.8), the differences between CP and PAR are a different vertical resolution (resp. 35 and 70 vertical layers), an activation of the subgrid-scale orography scheme, and subgrid-scale deep convection in PAR only. Though the subgrid-scale deep convection parameterization is switched off in CP, shallow convection is still parameterized following Tiedtke (1989).

The RegCM version 4.7.1 (Giorgi et al. 2012) was run at a grid spacing of 25 km with 23 vertical levels (RegCM-25), and at a grid spacing of 3 km with 41 vertical levels (RegCM-3.0). RegCM-3.0 utilizes a lake model following Hostetler et al. (1993) and Bennington et al. (2014), while in the RegCM-25 simulations this is turned off. The RegCM-25 and RegCM-3.0 simulations use the SUBEX (Pal et al. 2000) and WRF Single-Moment 5-class (Hong et al. 2004) cloud microphysics schemes, respectively. The coarse RegCM-25 simulation uses a Tiedtke convective parameterization over land points (Tiedtke 1996) and a Kain-Fritsch parameterization over ocean points (Kain and Fritsch 1990; Kain 2004). While no large-scale convective adjustment scheme is utilized for the RegCM-3.0 simulations, a shallow-convective adjustment scheme (Tiedtke 1996) is retained to account for shallow convective processes below the 3 km grid scale. Other differences between the PAR and CP simulations include the use of a non-hydrostatic core in RegCM-3.0, while RegCM-25 uses a hydrostatic core. A detailed description of all five models can be found in Section 5.1.

2.3 Observational datasets

The model performance is assessed against observational products for precipitation, top-of-atmosphere radiation and lake surface temperature, mostly satellite-derived. The set of seven precipitation products contains Tropical Rainfall Measuring Mission (TRMM) Multi-satellite Precipitation Analysis version 7 (TMPA v7, with resulting product 3B42.7, Kummerow et al. 2000), Integrated Multi-satellite Retrievals for Global Precipitation Measurement (IMERG v6, Huffman et al. 2015, 2020), Multi-Source Weighted-Ensemble Precipitation version 2 (MSWEP v2, Beck et al. 2019), Tropical Applications of Meteorology using Satellite data and ground-based observations (TAMSAT, Maidment et al. 2017), Climate prediction center morphing method (CMORPH, Joyce et al. 2004), Climate Hazards group InfraRed Precipitation with Station data (CHIRPS, Funk et al. 2015) and Global Satellite Mapping of Precipitation version 6 (GSMaP v6, Ushio et al. 2009). Even though rainfall over the lake is not well known, Nicholson et al. (2021) and Ageet et al. (2022) indicate that IMERG very likely overestimates rainfall and that TMPA has issues with gauge calibrations over the lake. However, thanks to this large set, individual product uncertainty is reduced and information about observational spread is included.

Both short- and longwave top-of-atmosphere radiation are evaluated against three observational products: Clouds and the Earth’s Radiant Energy System (CERES, Wielicki et al. 1996), Geostationary Earth Radiation Budget (GERB, Harries et al. 2005; Clerbaux et al. 2009) as well as Meteosat Visible and InfraRed Imager (MVIRI) onboard the Meteosat First Generation (MFG) and the Spinning Enhanced Visible and InfraRed Imager (SEVIRI) onboard the Meteosat Second Generation (MSG, Schmetz et al. 2002) satellites (MVIRI/SEVIRI, Urbain et al. 2017). The latter two have unprecedented temporal (30 and 15 min resp.) and spatial (\(\sim\)1 km at nadir) resolution, but lack data after April 2015.

Satellite remote sensing can accurately represent in-situ observed lake surface temperature patterns (Thiery et al. 2015). As a consequence, remote sensing is increasingly used to monitor the spatio-temporal patterns of lake surface temperature (Woolway and Merchant 2019; Woolway et al. 2020). Although persistent cloud cover impedes high-frequency lake surface temperature monitoring in the tropics, the multi-decadal availability of data enables the evaluation of the model climatology. Here, observational data is obtained from the ARC-Lake dataset (MacCallum and Merchant 2012). Due to the limited time overlap between the ARC-Lake product (1996–2012) and the simulation period (2006–2015), the full ARC-Lake climatology is used to evaluate the spatial heterogeneity and annual cycle of lake surface temperature. The assumption of minor changes between both periods is supported by the small interannual variations within the observation period.

All gridded datasets are interpolated to match the high-resolution model grid, using a second-order conservative (for precipitation) or a bilinear (for radiation and temperature) remapping technique. Differences in the temporal resolution of the precipitation products are taken into account by excluding the coarsest-resolution products, CHIRPS (6h) and TAMSAT (1d), from sub-daily analyses and by interpolating 3-hourly products with cubic splines.

2.4 Evaluation metrics

This paper focuses on the representation of precipitation and top-of-the-atmosphere radiation, both of which are expected to be affected when going to the convection-permitting scale and are available from satellite observations. The representation of clouds is expected to be sensitive to the parameterisation of convection (Brisson et al. 2016), but little observational data is available. Therefore, top-of-the-atmosphere radiation, which is strongly affected by the representation of clouds but has little observational uncertainty, is a good alternative. Due to the double penalty problem of a location-based analysis of time series (Wernli et al. 2008; Van Weverberg et al. 2010), the focus is on statistical metrics such as the spatial distribution of temporal averages, the spatially and temporally averaged annual and daily cycles, and statistical distributions of spatio-temporally pooled data. Percentiles are calculated over all time steps, i.e., not conditional on the occurrence of rain. Moreover, several key parameters that are expected to be affected when going to the convection-permitting scale were evaluated, namely the amplitude and peak local time of radiation and precipitation, the latter being known to be too early in models that parameterise convection. Models that parameterise convection are also well known to overestimate drizzle, and therefore the total number of rainy events is also a key evaluation parameter. Lake temperatures, including their spatial distribution, were analysed in an attempt to find the underlying causes for deficiencies in precipitation. Because observational data are sparse and the analysis depends on satellite products that tend to underestimate extreme precipitation (Ageet et al. 2022), the analysis of extreme precipitation should be interpreted with caution. An overview of all evaluation metrics is given in Table 1.
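To illustrate why it matters that percentiles are not conditional on the occurrence of rain, the following minimal sketch compares an unconditional percentile with a wet-only percentile. The rainfall values are invented for illustration and the nearest-rank estimator is an assumption; the paper does not state which percentile estimator was used.

```python
from typing import List


def percentile(values: List[float], q: int) -> float:
    """Nearest-rank percentile (integer q in 1..100) of a non-empty list.

    The nearest-rank definition is an assumption made for this sketch.
    """
    ordered = sorted(values)
    # Ceiling of q * n / 100 using integer arithmetic to avoid rounding issues.
    rank = max(1, -(-q * len(ordered) // 100))
    return ordered[rank - 1]


# Hypothetical three-hourly rainfall (mm/3h): 16 dry and 4 wet intervals.
rain = [0.0] * 16 + [1.0, 2.0, 5.0, 12.0]

# Unconditional percentile: dry intervals are kept in the sample.
p90_all = percentile(rain, 90)

# Wet-only percentile, shown for contrast.
wet_only = [r for r in rain if r > 0.0]
p90_wet = percentile(wet_only, 90)

print(p90_all, p90_wet)
```

Because dry intervals remain in the sample, an unconditional percentile also responds to how often it rains, so a model with too many rainy events can show a bias in a given percentile even if its wet-interval intensities were realistic.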

Table 1 Evaluation metrics used in this paper and applied to precipitation (P), top-of-the-atmosphere upward longwave radiation \(LW\uparrow _{TOA}\), top-of-the-atmosphere upward shortwave radiation \(SW\uparrow _{TOA}\) and lake surface temperature (\(T_l\)). Crosses indicate which evaluation metric is used for each variable

We examine the evaluation metrics in Table 1 for the different models at both PAR and CP scales and compare them to the observational datasets. Given the considerable differences between the observational datasets (e.g., IMERG versus CHIRPS for precipitation, see Fig. S1), both an observational mean and range are determined, to which the model output is compared. For figures showing the deviations of the models from the observed data, areas that fall within the observational range are masked out. This paper focuses on robust signals among the different models and not on performance and improvements needed for the individual models.
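The comparison against an observational mean and range can be sketched as follows. This is an illustrative sketch only: the function name `masked_bias`, the flat list layout of pixels and the numbers are assumptions, not the code used for the figures.

```python
from typing import List, Optional


def masked_bias(model: List[float],
                products: List[List[float]]) -> List[Optional[float]]:
    """Bias against the observational mean per pixel.

    Returns None (masked) where the model value lies within the
    [min, max] range spanned by the observational products.
    """
    result: List[Optional[float]] = []
    for i, value in enumerate(model):
        obs = [product[i] for product in products]
        obs_mean = sum(obs) / len(obs)
        if min(obs) <= value <= max(obs):
            result.append(None)  # within observational range: masked out
        else:
            result.append(value - obs_mean)
    return result


# Hypothetical annual rainfall (mm/year) at three pixels for three products.
products = [
    [1000.0, 1200.0, 800.0],
    [1100.0, 1500.0, 900.0],
    [1200.0, 1800.0, 1000.0],
]
model = [1150.0, 2000.0, 700.0]

bias = masked_bias(model, products)
print(bias)
```

With this construction, a model is only flagged as biased where it falls outside the spread of all products, which keeps disagreement among the observations from being counted as model error.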

3 Results

3.1 Precipitation: bias, annual and daily cycle

When comparing deviations of the annual precipitation for PAR and CP models from the observed range, a general overestimation of precipitation in the Lake Victoria basin and its surroundings is detectable, particularly for the lake area and regions of complex orography (Fig. 2). However, the details are quite diverse among models: for example, while ALADIN/AROME shows an improvement of rainfall amounts in the CP compared to the PAR model, the opposite is true for UKMO, and for RegCM the anomalies partially change sign. Thus, we identify no obvious general improvement of CP models compared to PAR models for rainfall totals (see ensemble mean). Moreover, the multimodel ensemble mean outperforms the results of individual models, a result also found by Endris et al. (2013). We also note that these biases are significantly larger than interdecadal variability in rainfall totals, illustrated by comparing Fig. 2 with the difference in observed rainfall between the time period simulated by the Met Office model and that of other models (Fig. S14). In other words, the sensitivity of results to this difference in the time period is minimal. Note that an overview of all models and precipitation products can be found in Figs. S1 and S2.

The performance of the models is next assessed by comparing to the annual cycle of precipitation for land versus lake areas (Fig. 3). Similarly to Endris et al. (2013), the models realistically represent the seasonal cycle associated with the overpass of the Intertropical Convergence Zone resulting in a bimodal annual cycle. For land areas (Fig. 3a), the annual cycle of the individual models (colored lines) is largely within the envelope of the observational datasets (gray area). For April and November, the rainiest months, most CP models show higher precipitation totals than PAR models. The situation is quite different over the lake area (Fig. 3b), where several models have more difficulty representing the annual cycle (e.g., relatively flat annual cycle in WRF) and often considerably exceed the observational range (note also different scale). This occurs not only in the rainiest months but also at other times, particularly between June and September. In general, similarly to Endris et al. (2013), the sign of the seasonal average bias varies between the PAR models. Moreover, there is no obvious improvement or deterioration in annual cycle when comparing CP versus PAR models to observations.

In contrast, promising results are identified when analysing the precipitation biases in terms of the diurnal cycle (Fig. 4). Similarly to Nikulin et al. (2012), precipitation in PAR is triggered too early during the diurnal cycle. More specifically, for the land areas (Fig. 4a; Table 2), PAR models generally show the daily peak too early, at 13:46 local time (Greenwich Mean Time + 3h), compared to the observational range (gray area, at 18:07 local time). The rainfall peaks are delayed for several CP models (e.g., COSMO-CLM, AROME, UKMO), leading to a better average agreement with the observations (16:42 local time). Note that the average rainfall peak value (mm/h) is often larger than in all observational products (Fig. 4b). For the lake area, the observed rainfall peak in the morning (7:23 local time) is generally well represented by the models (Fig. 4c), though rainfall values are often overestimated, particularly for the daily maximum in hourly precipitation (Fig. 4d). Three of the five models show a slight delay in the morning precipitation maximum over the lake.

Over land, the peak local time of precipitation generally occurs during the late afternoon and evening (Fig. 5). The ensemble mean land precipitation in PAR models peaks 4.4 hours too early, especially east of the lake and in the northern part of the domain, although COSMO-CLM deviates from this pattern with a delayed daily precipitation maximum east of the lake, partly compensated by too early precipitation peaks elsewhere (Fig. S4). In all models these biases are much alleviated when going to the CP scale, with the timing of the daily maximum in land precipitation systematically delayed in CP compared to PAR models by almost three hours, thereby reducing the bias in timing by 67%. Substantial biases in PAR models are present on the shores of the lake, where most people live and where the ensemble mean is out of phase with the observed cycle (Fig. 5). This region benefits from improvements in the peak time of precipitation when going to the CP scale, which might be related to an improved representation of the convective process but also to the improved representation of the coastline. Over the lake, precipitation peaks during the night in the west, with a gradient toward the east, where it peaks during the early morning. The timing of precipitation over the lake is quite well represented in both PAR and CP models (Table 2), although it is slightly too early in the west of the lake. Overall, we conclude that CP models are in better agreement with the observations than PAR models due to the delayed rainfall maxima, particularly over land areas.
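The timing figures quoted above can be checked against the ensemble-mean land peak times given in the text (13:46 for PAR, 16:42 for CP and 18:07 for the observations). The short sketch below reproduces that arithmetic; the helper name `to_hours` is introduced here for illustration, and plain subtraction is sufficient only because none of these times wraps around midnight.

```python
def to_hours(hhmm: str) -> float:
    """Convert an 'HH:MM' local time string to decimal hours."""
    hours, minutes = hhmm.split(":")
    return int(hours) + int(minutes) / 60.0


obs_peak = to_hours("18:07")  # observational peak over land
par_peak = to_hours("13:46")  # PAR ensemble mean
cp_peak = to_hours("16:42")   # CP ensemble mean

par_bias = par_peak - obs_peak  # negative: peak occurs too early
cp_bias = cp_peak - obs_peak
delay = cp_peak - par_peak      # shift of the peak from PAR to CP
reduction = 1.0 - abs(cp_bias) / abs(par_bias)

print(f"PAR bias: {par_bias:.2f} h, CP bias: {cp_bias:.2f} h")
print(f"CP delay relative to PAR: {delay:.2f} h")
print(f"Reduction of timing bias: {100.0 * reduction:.0f}%")
```

The PAR bias of about 4.35 h, the delay of about 2.9 h and the reduction of about 67% are consistent with the values stated in the text.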

Fig. 2 Averaged total rainfall bias for all parametrised and convection-permitting models as well as the ensemble mean. As for all maps, the bias is calculated against the observational mean per pixel, while model values within the observational range are masked white. Absolute rainfall amounts for all observational products and individual models are provided in Fig. S1 and S2 respectively

Fig. 3 Land and lake seasonal cycles of rainfall for all parametrised and convection-permitting models compared to the observational band

Fig. 4 Land and lake diurnal cycles of rainfall for all parametrised and convection-permitting models compared to the observational band in the left subplots. Corresponding rainfall peaks and peak times are indicated in the right subplots, while arrows connect the information for the parametrised (dots) and convection-permitting (stars) simulations

Table 2 Precipitation metrics over the 10-year period averaged over lake, land and the full evaluation domain, namely total number of rainy events per year (nre), peak local time of precipitation (\(t_{peak}\)), and 90th, 95th and 99th precipitation percentiles (P90, P95 and P99). All metrics are calculated based on the ensemble mean (EM) of all parameterized (PAR) and convection-permitting (CP) simulations separately

3.2 Precipitation: distribution and extremes

Fig. 6 shows that rainfall events are observed to be more frequent to the west, closer to the Congo basin, over the mountains, and northeast of the lake. The ensemble mean of the PAR models, as expected from past studies, shows far too many rainfall events almost everywhere, with a multi-model mean bias over the entire evaluation domain of 313 events per year. The CP ensemble alleviates this to a multi-model bias of -65 events per year (Table 2), although it slightly underestimates the number of events over the lake and to its northeast, and slightly overestimates the number of events in the southeast corner of the domain. This indicates that the representation of periods without precipitation improves when going to the CP scale. Individual models tend to show more frequent on-lake rain events towards the northwest corner of the lake, as seen in the mean of the observations, in both PAR and CP models (Fig. S7).

The spatial patterns of the 90th, 95th and 99th percentiles of three-hourly observed rainfall are all largely similar (Fig. 7), and reflect the mean rainfall distribution (Fig. S1). Ensemble mean errors in these percentiles reflect the too frequent rainfall in parameterised models, with the 90th percentile of the parameterised ensemble being too wet by 0.8 mm/3h, and the 99th percentile being too dry by 2 mm/3h (Table 2, see also Fig. 8 for the full rainfall distribution). In contrast, the rainfall in the convection-permitting models represents the 90th percentiles within 0.1 mm/3h, and overestimates the 99th percentiles by 4 mm/3h compared to the observations. For the 95th percentile, which corresponds to 0 to 10 mm in 3 hours in the observations, the rain rates in the convection-permitting models are too high over the lake and too low over the surrounding land, reflecting over-intense on-lake rainfall. This problem is particularly pronounced in the convection-permitting UKMO model (Fig. S9). Only the COSMO-CLM PAR model gives both wetter extremes than observed and wetter extremes than the equivalent CP model: this is linked to a double-peaked rainfall distribution, with the second peak possibly caused by explicit convection occurring on the coarse parameterised model grid (Fig. S9). Spatial variability in the biases of rainfall extremes and number of rainy events is generally rather small, except for the different behavior over the lake compared to the land. In fact, this spatial variability is smaller than what was found by Onyutha (2020) in historical CORDEX RCM integrations compared to in-situ precipitation data.

A comparison of COSMO-CLM-CP with 12 precipitation stations in the Rwenzori Mountains, located in the western branch of the East African Rift System, indicates a much better representation of extreme precipitation metrics in the CP model compared to the IMERG satellite product (Nakulopa et al. 2022). The observational IMERG product underestimates extreme precipitation, with increasing biases for more extreme precipitation. Therefore, the bias in the 99th precipitation percentile in CP (Table 2) might be related to an underestimation in the observational satellite products rather than an overestimation in the CP models. Similarly, the bias in the 99th precipitation percentile in PAR might be more negative in reality than what we found here.

Fig. 5 Peak local time of precipitation for the observational mean as well as parametrised and convection-permitting model ensemble mean bias against this observational mean. Both observational mean and model bias means are calculated per pixel as a temporal average, by first converting peak time hours to radians, then averaging both the sine and cosine of the angles and finally converting back the mean sine and cosine results with the arctangent to the mean peak time. Peak times for all observations, models and individual biases are provided in Fig. S3, S4 and S5 respectively

Fig. 6 Total number of rainy events (nre) per year for the observational mean, as well as the parametrised and convection-permitting model ensemble mean bias against this observational mean. Values within the observational range are masked white. The nre metric counts the number of three-hourly precipitation events exceeding 0.125 mm/3h. Results for all observations, models and individual biases are provided in Fig. S6, S7 and S8 respectively
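The nre metric defined in the caption of Fig. 6 amounts to a threshold count on the three-hourly series. A minimal sketch, with hypothetical function names and invented sample values, could look like this:

```python
from typing import Sequence

THRESHOLD_MM_PER_3H = 0.125  # threshold stated in the caption of Fig. 6


def count_rainy_events(three_hourly_mm: Sequence[float],
                       threshold: float = THRESHOLD_MM_PER_3H) -> int:
    """Number of three-hourly intervals whose rainfall exceeds the threshold."""
    return sum(1 for value in three_hourly_mm if value > threshold)


def rainy_events_per_year(three_hourly_mm: Sequence[float],
                          n_years: float) -> float:
    """Average number of rainy events per year over a multi-year series."""
    return count_rainy_events(three_hourly_mm) / n_years


# Hypothetical one-day sample (eight three-hourly values, mm/3h).
sample = [0.0, 0.1, 0.125, 0.2, 3.0, 0.0, 0.13, 0.0]
events = count_rainy_events(sample)
print(events)
```

Since a non-leap year contains 8 × 365 = 2920 three-hourly intervals, nre is bounded by that value, which gives a sense of scale for the ensemble-mean biases of 313 and -65 events per year.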

Fig. 7 Three-hourly 90\(^\mathrm{th}\), 95\(^\mathrm{th}\) and 99\(^\mathrm{th}\) rainfall quantiles for the observational mean as well as the parametrised and convection-permitting model ensemble mean bias against this observational mean. Values within the observational range are masked white

Fig. 8 Separate land and lake rainfall contributions of three-hourly intensity bins for the parametrised and convection-permitting model ensemble compared to the observational band. For each intensity bin, the distribution visualizes the precipitation contribution to the total rainfall. Hence, low-intensity events contribute little despite their high frequency of occurrence (left tail), while high-intensity events are rare, also resulting in a small overall contribution (right tail). Integrating over the full distribution results in the total rainfall. The observational minimum and maximum as well as the model mean, are calculated per intensity bin. Results for all individual models are provided in Fig. S9
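The quantity shown in Fig. 8 can be illustrated with a small sketch that sums the rainfall falling in each intensity bin. The bin edges and rainfall values below are invented for illustration; the binning actually used for the figure is not specified in the text.

```python
from typing import List, Sequence


def bin_contributions(rain: Sequence[float],
                      edges: Sequence[float]) -> List[float]:
    """Rainfall amount contributed by each intensity bin [edges[k], edges[k+1])."""
    totals = [0.0] * (len(edges) - 1)
    for value in rain:
        for k in range(len(edges) - 1):
            if edges[k] <= value < edges[k + 1]:
                totals[k] += value
                break
    return totals


# Hypothetical three-hourly rainfall (mm/3h): many weak, some moderate,
# and one intense event.
rain = [0.25] * 8 + [2.0, 3.0, 4.0, 5.0, 6.0] + [12.0]
edges = [0.0, 1.0, 10.0, 100.0]  # hypothetical bin edges (mm/3h)

contrib = bin_contributions(rain, edges)
print(contrib)
print(sum(contrib), sum(rain))
```

In this toy sample the eight weak events contribute less than the five moderate ones, the single intense event contributes less than the moderate bin as well, and the bin contributions add up to the total rainfall, mirroring the interpretation given in the caption.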

3.3 Upward shortwave radiation at top-of-the-atmosphere

Fig. 9 shows the annual mean top-of-atmosphere (TOA) upward shortwave radiation biases in the individual PAR and CP simulations and their multi-model ensemble means with respect to the observational mean (CERES, SEVIRI and GERB, see Fig. S10). Both positive and negative biases vary among the simulations without a common spatial pattern. In WRF, bias patterns are almost the same in the PAR and CP simulations. In contrast, COSMO-CLM CP simulates upward shortwave radiation more accurately than PAR and strongly reduces the positive bias. The ALADIN and AROME simulations reproduce upward shortwave radiation quite accurately with a patchy bias pattern over the land area. The UKMO PAR simulation underestimates upward shortwave radiation while the corresponding CP simulation overestimates it. For both PAR and CP scales, the model ensemble average outperforms nearly all the individual simulations, with biases slightly larger than or comparable to the differences between the observational datasets. Much of this good performance of the ensemble mean results from the cancellation of oppositely signed biases in the individual runs.
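
The cancellation effect mentioned above can be made explicit by comparing the bias of the ensemble mean with the mean absolute bias of the members. The sketch below uses hypothetical function names and synthetic bias values, not values from the simulations.

```python
def ensemble_mean_bias(biases):
    """Bias of the ensemble mean, i.e. the signed average of the member biases."""
    return sum(biases) / len(biases)


def mean_absolute_bias(biases):
    """Average magnitude of the member biases, which is insensitive to cancellation."""
    return sum(abs(b) for b in biases) / len(biases)


# Synthetic member biases in W m^-2 with opposite signs.
member_biases = [12.0, -10.0, 4.0, -6.0]
```

For these synthetic values the ensemble mean bias is zero although the members are off by 8 W m\(^{-2}\) on average, so a small ensemble mean bias alone does not imply that the individual models perform well.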

Similar to precipitation (Fig. 3), the annual cycle of upward shortwave radiation has two smooth maxima during the rainy seasons, February-April and October-December (Fig. 10). In general, almost all simulations capture the basic shape of the annual cycle over land, but not all are able to accurately simulate its amplitude. At the PAR scale, the difference between the lowest (WRF) and highest (COSMO-CLM) simulations during the rainy seasons can reach 50-60 W m\(^{-2}\), which is about 50% of the observed values. The spread across the CP simulations is reduced, although this is mostly related to a more accurate annual cycle in the COSMO-CLM CP experiment compared to the PAR one. Over the lake, the spread between the PAR experiments with the lowest and highest simulated upward shortwave radiation becomes even larger than over land and can reach 100 W m\(^{-2}\) (about 100% of the observed values) in April and November.

The diurnal cycle of upward shortwave radiation over the land and lake areas is shown in Fig. 11. The maximum over the lake is observed around 12 LT with some spread across the observations. The maximum over land occurs later, at about 14:30 LT, and the observational datasets agree well on the timing. Even though the changes occurring when going to CP scale vary among the models, some improvements are detected over land, especially through a reduced amplitude of the diurnal cycle in two models (COSMO-CLM and UKMO). In contrast to the precipitation diurnal cycle, most models do not show any delay when going to the CP scale, except for UKMO. Over the lake, some models show an increase and others a decrease in the amplitude of the diurnal cycle, and there is neither a consistent improvement nor a consistent deterioration when going to CP scale. The CP experiment in UKMO brings the timing in line with the observations, whereas the other models already fall within the observed range in both PAR and CP. Note that the ALADIN/AROME simulations provide only 6-hourly upward shortwave radiation (shown as 6-hour steps), so a detailed analysis of their diurnal cycle is not possible.

In the current ensemble, the impact of CP downscaling on upward shortwave radiation at TOA over the LVB is mixed. Biases in the coarser PAR simulations can be reduced, remain similar or be amplified in the CP simulations, without a systematic pattern across the models. Additionally, the absence of upward shortwave radiation output for RegCM and the coarse output frequency for ALADIN/AROME reduce the ensemble size to four members (spatial maps and annual cycle) or even three members (diurnal cycle) instead of the full ensemble.

Fig. 9

Averaged upward shortwave radiation at the top-of-the-atmosphere bias for all parametrised and convection-permitting models except RegCM as well as the ensemble mean. The bias is calculated against the observational mean, while model values within the observational range are masked white. Results for all observational products and individual models are provided in Fig. S10 and S11 respectively

Fig. 10

Land and lake seasonal cycles of upward shortwave radiation at the top-of-the-atmosphere for all parametrised and convection-permitting models except RegCM compared to the observational band

Fig. 11

Land and lake diurnal cycles of upward shortwave radiation at the top-of-the-atmosphere for all parametrised and convection-permitting models except RegCM compared to the observational band in the left subplots. Due to their low temporal resolution, ALADIN-12 and AROME-2.5 are visualised as step functions. Corresponding radiation range (maximum minus minimum) and peak times are indicated in the right subplots, while arrows connect the information for the parametrised (dots) and convection-permitting (stars) simulations
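
The two summary quantities shown in the right subplots, the diurnal range and the peak time, can be derived from a mean diurnal cycle as sketched below. The function name and the three-hourly synthetic cycle are assumptions for illustration only.

```python
def diurnal_range_and_peak(hours, values):
    """Return (maximum minus minimum, local hour of the maximum) for a mean diurnal cycle."""
    peak_index = max(range(len(values)), key=lambda i: values[i])
    return max(values) - min(values), hours[peak_index]


# Synthetic three-hourly upward shortwave cycle in W m^-2 (zero at night).
local_hours = [0, 3, 6, 9, 12, 15, 18, 21]
upward_sw = [0.0, 0.0, 40.0, 210.0, 330.0, 280.0, 60.0, 0.0]
sw_range, sw_peak = diurnal_range_and_peak(local_hours, upward_sw)
```

With coarse output, such as the 6-hourly ALADIN/AROME fields, the peak time can only be resolved to the nearest output step, which is consistent with excluding those simulations from the detailed timing analysis.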

3.4 Upward longwave radiation at top-of-the-atmosphere

With the exception of the CP AROME simulation over high terrain to the east of the lake, all simulations show a mean overestimation in upward longwave radiation compared to the SEVIRI mean (Fig. 12). In all the simulations, the largest biases generally occur to the west of the lake, where complex high terrain and a strong diurnal cycle in convection may contribute to this bias. Lake Victoria itself, another area with a strong diurnal cycle, also shows a maximum in biases in the WRF and RegCM CP simulations. Comparing the PAR simulations to the CP simulations, WRF, RegCM and COSMO-CLM all show larger biases in the CP simulation, while the UKMO and ALADIN/AROME simulations show a reduced bias in the CP simulation. This larger bias in the CP simulations is also evident in the ensemble mean, where the CP ensemble shows a somewhat larger upward longwave radiation bias to the west of the lake compared to the PAR ensemble.

The annual cycle in upward longwave radiation is strongly modulated by the annual variation in convective activity. Fig. 13 shows the mean annual cycle of monthly upward longwave radiation for the CP and PAR simulations along with the observational band. Two annual minima in upward longwave radiation mark the peaks in convective activity over the LVB during the March-May and October-December rainy seasons. Additionally, most of the simulations and the SEVIRI observations show a peak in upward longwave radiation during the northern hemisphere summer months; however, the timing of this peak has a large spread among the models. For instance, at CP scale the summer peak is in September in WRF and RegCM, while UKMO peaks in June. The SEVIRI observations, CP COSMO-CLM and CP AROME show a summer peak in July. Interestingly, all models generally agree with the SEVIRI observations on the two annual minima in April and November, while the timing of the annual peak is not consistent among the models and SEVIRI. Most models perform similarly over land and lake when compared to SEVIRI. The models also show a general overestimation in every month, which is consistent with Fig. 12. However, the models that show the largest overestimation in Fig. 13 (WRF and RegCM) seem to suffer from a particularly weak annual cycle.

The analysis of the diurnal cycle also reveals a substantial overestimation in TOA upward longwave radiation over both land and lake (Fig. 14), consistent with the annual mean and monthly results reported above. Over land, the amplitude of the daily cycle (diurnal range) is underestimated substantially in the PAR models, with a clear improvement when going to the CP scale (Fig. 14b). Over the lake, the diurnal range is too large in PAR but also strongly improves when going to the CP scale, at which three out of five models represent the diurnal range within the observational spread (Fig. 14d). Over land, the timing of the daily cycle is quite well represented in both PAR and CP models, but over the lake most models exhibit a peak that is too late. Neither a systematic improvement nor a systematic deterioration is found for this timing when going to the CP scale. Overall, the diurnal range in TOA upward longwave radiation improves when going to the CP scale through an increase in the diurnal range over land and a decrease over the lake, both of which correspond better to the observations.

Fig. 12

Averaged upward longwave radiation at the top-of-the-atmosphere bias for all PARametrised (PAR) and Convection-Permitting (CP) models as well as the ensemble mean. The bias is calculated against the observational mean, while model values within the observational range are masked white. Results for all observational products and individual models are provided in Fig. S12 and S13 respectively

Fig. 13

Land and lake seasonal cycles of upward longwave radiation at the top-of-the-atmosphere for all parametrised and convection-permitting models compared to the observational band

Fig. 14

Land and lake diurnal cycles of upward longwave radiation at the top-of-the-atmosphere for all parametrised and convection-permitting models compared to the observational band in the left subplots. Due to their low temporal resolution, ALADIN-12 and AROME-2.5 are visualised as step functions, while RegCM-25’s 6-hourly instantaneous values are shown as dots. Corresponding radiation range (maximum minus minimum) and peak times are indicated in the right subplots, while arrows connect the information for the parametrised (dots) and convection-permitting (stars) simulations

3.5 Lake temperature

Over Lake Victoria, annual average observed lake temperatures vary between 297.9 and 299.2 K, with the highest values occurring in the northeast of the lake (Fig. 15). Averaged across the lake, monthly mean surface temperatures show only a limited seasonality, with minimum values of 297.5 K during July and August and maximum values of around 299.0 K from March to April (Fig. 16). The lake surface temperature seasonality is primarily linked to seasonal variations in the latent heat flux, with drier near-surface atmospheric conditions during the boreal summer months enabling enhanced evaporation-driven cooling (Thiery et al. 2014; Docquier et al. 2016). While most models display a similar seasonality, they often suffer from a systematic bias. In five out of the ten simulations, an annual warm bias of more than 1 K is found. Increasing the model’s horizontal resolution reduces model bias in the case of COSMO-CLM, ALADIN/AROME and RegCM, but increases the bias for WRF. Note that both UKMO simulations display no bias, as they used the climatology of the ARC-Lake product as a lower boundary condition (Stratton et al. 2018). Though most models reproduce the general spatial pattern with increasing annual average lake temperatures from southwest to northeast, the simulations display a large spread in terms of spatial temperature pattern, especially in coastal zones (Fig. 15).

In the case of ALADIN/AROME, a reduction of the warm lake surface temperature bias (especially in the coastal zones) in the high-resolution simulation is accompanied by a substantial reduction in the over-lake wet bias (Figs. 2, 15). Increased spatial resolution in COSMO-CLM reduces the warm bias near the eastern lake shore, accompanied by a mild reduction in the wet bias in this sector of the lake. For both models, the improved representation of the top-of-atmosphere shortwave radiation near the center of the lake (Fig. 10) suggests that the cloud cover is better represented in the high-resolution simulation, potentially caused by a more realistic representation of surface-triggered convection. However, in general it is difficult to attribute changes in over-lake precipitation skill directly to different lake surface temperature patterns, given that other factors such as explicit convection and enhanced orographic detail confound the precipitation response to lake surface temperatures.

Fig. 15

Standardized lake surface temperatures for the ARC-Lake observational product as well as all parametrised and convection-permitting models. This normalisation allows for a relative lake surface temperature heterogeneity comparison, and is realised by subtracting the lake average \(\overline{T_l}\) and dividing by the standard deviation \(\sigma _l\). Values for both parameters are provided in Table 3
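
The normalisation described in this caption can be sketched as below. The function name is hypothetical, and the use of the population standard deviation over lake pixels is an assumption, since the caption does not state which estimator was used.

```python
import statistics


def standardize_lake_temperatures(lake_temps):
    """Return (standardized values, lake average, standard deviation) for lake pixels."""
    lake_mean = statistics.fmean(lake_temps)
    # Assumption: population standard deviation across all lake pixels.
    lake_sigma = statistics.pstdev(lake_temps)
    if lake_sigma == 0.0:
        raise ValueError("lake temperatures are uniform; cannot standardize")
    standardized = [(t - lake_mean) / lake_sigma for t in lake_temps]
    return standardized, lake_mean, lake_sigma


# Synthetic annual mean temperatures (K) for three lake pixels.
z_values, t_mean, t_sigma = standardize_lake_temperatures([297.9, 298.5, 299.1])
```

Because each field is scaled by its own mean and spread, a model with a uniform warm bias can still show the same standardized pattern as the observations, which is what makes the spatial heterogeneity comparable across models.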

Fig. 16

Lake-averaged surface temperature seasonal cycles for all parametrised and convection-permitting models compared to the ARC-Lake observational band, which represents the spread across the individual years

Table 3 Standardization parameters for lake surface temperatures of ARC-Lake observational product as well as all parametrised and convection-permitting models. The lake surface temperature normalisation is realised by subtracting the lake average \(\overline{T_l}\) and dividing by the standard deviation \(\sigma _l\). Resulting standardized lake surface temperature patterns are shown in Fig. 15

4 Conclusion

This study investigates the representation of moist convection systems in a new multi-model ensemble for the Lake Victoria Basin (LVB) in Equatorial East Africa. This multi-model ensemble was recently generated under the framework of the CORDEX Flagship Pilot Study ELVIC (climate Extremes in the Lake VICtoria basin). The ensemble includes five regional climate models, namely COSMO-CLM, RegCM, AROME, WRF and UKMO. A 10-year period (2006–2015) was chosen, plus the year 2005 as spin-up. Pairs of model integrations were performed: one at the 10 kilometer scale (12–25 km) using PARameterized (PAR) convection, and one at the kilometer scale (2.5–4.5 km), or Convection-Permitting (CP) scale. A protocol was set up to streamline the model integrations, but deviations from the protocol were allowed since limited resources are available for this computationally expensive research for Africa. Moreover, the integrations were also used in other projects for which different requirements were set.

The model performance was assessed against available observational products. Where several products were available, we used all of them in order to obtain both an estimate of the observed value and the associated uncertainty range. For precipitation, seven products were used, whereas three products were employed for top-of-atmosphere radiation. Lake surface temperature was evaluated using the ARC-Lake dataset.

For time scales of a month or longer, PAR models are just as good as CP models, with no clear systematic improvements nor deteriorations when going from the PAR to CP scale. For example, lake average absolute biases improve by up to 1108 mm yr\(^{-1}\) or deteriorate by up to 1560 mm yr\(^{-1}\) for the individual models. This stresses the value of PAR regional climate models in CORDEX downscaling of global climate models. The spatial distribution of total rainfall and the seasonal cycle were affected in individual models when moving to CP scale, but the sign of this change and the bias reduction or increase varied from model to model. Even though an attempt was made to better understand individual model deficiencies by evaluating lake surface temperature patterns, it was found to be difficult to attribute changes in over-lake precipitation skill.

Remarkably high biases in the multi-annual averages (up to 30 W m\(^{-2}\)) and in the seasonal cycle of Top-Of-Atmosphere (TOA) upward radiative fluxes were found, both in PAR and CP models. Interestingly, all models show an overestimation of TOA upward longwave radiation at both resolutions, while individual models exhibit both over- and underestimations of the TOA upward shortwave radiation. We recommend including TOA radiative fluxes in the standard evaluation of regional climate models. After all, these variables can be measured with limited observational uncertainty, and they determine the energy balance of the combined atmosphere-land system. A bias in these variables might reveal a misrepresentation of the vertical distribution of clouds in the models (Brisson et al. 2016).

For metrics determined by daily or sub-daily variations, substantial systematic improvements were found when going to the CP scale. The diurnal cycle in precipitation substantially improves with a systematic delay in the daily precipitation maximum in all models. Peak time of precipitation strongly improves over land, again by delaying the precipitation maximum by almost three hours in the CP ensemble compared to PAR, thereby reducing the bias by 67%. A remarkable improvement in peak time of precipitation is also identified close to the lake coast, where most people in the LVB live. Here, the daily cycle of the PAR ensemble was out of phase with the observed cycle, whereas the improved representation of the coastline in the CP ensemble alleviates this substantial bias. Note that not only for precipitation but also for the radiative fluxes at the TOA, the diurnal range improves, especially for the longwave, with the bias decreasing from 15.7 W m\(^{-2}\) in PAR to 7.7 W m\(^{-2}\) in CP. The substantial overestimation of the total number of rainy events is relieved when going to the CP scale, with the multi-model mean bias over the entire evaluation domain changing from 313 events per year for the PAR ensemble to -65 events per year for the CP ensemble. This indicates that not only the extreme precipitation, but also the representation of periods without precipitation improves. The overestimation by 0.8 mm/3h in the 90th rainfall percentile in the PAR ensemble is alleviated in CP (<0.1 mm/3h). For the 99th percentile of precipitation, a deterioration is found in CP compared to PAR (4 mm/3h too wet in CP and 2 mm/3h too dry in PAR), which might be explained by an underestimation of extreme precipitation in the satellite products. Indeed, Ageet et al. (2022) showed that the satellite products used herein tend to underestimate extremes at the 95th percentile to varying degrees, compared to gauge observations from Uganda and neighbouring countries. Moreover, Nakulopa et al. (2022) demonstrated a better correspondence of extreme precipitation in COSMO-CLM-CP than in the IMERG satellite product when compared to 12 precipitation stations in the Rwenzori Mountains. Improved precipitation observations in the LVB at high temporal frequency are therefore key to improving our understanding of how CP models improve the very extreme precipitation (>95th percentile).
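
As a worked illustration of how such bias changes can be expressed in relative terms, the sketch below computes the fractional reduction in absolute bias from the PAR and CP values quoted above. The function name and the choice of absolute bias as the basis are assumptions; the percentages it yields are derived here and are not reported in the study.

```python
def relative_bias_reduction(bias_par, bias_cp):
    """Fractional reduction in absolute bias when going from PAR to CP."""
    return 1.0 - abs(bias_cp) / abs(bias_par)


# Longwave diurnal range bias: 15.7 W m^-2 (PAR) to 7.7 W m^-2 (CP).
longwave_reduction = relative_bias_reduction(15.7, 7.7)
# Number of rainy events bias: 313 (PAR) to -65 (CP) events per year.
nre_reduction = relative_bias_reduction(313.0, -65.0)
```

Under this definition the longwave diurnal range bias is roughly halved and the rainy-event bias shrinks by roughly four fifths in magnitude, although the latter also changes sign.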

Our results confirm the statement made by Thiery et al. (2016), who advocated coordinated high-resolution projections for the LVB, based on the improved performance of extreme 6-hourly precipitation in a COSMO-CLM2 simulation at \(\sim\)7 km grid spacing compared to CORDEX models and state-of-the-art reanalyses. The substantial systematic improvements in the representation of precipitation and TOA radiation on the daily to sub-daily timescale are indicative of a better representation of the convective systems when going to CP scale. Together with the fact that the CP models explicitly model convection, this suggests that CP models are valuable tools for future climate projections of extreme events in equatorial regions. Moreover, the complex orography of the LVB makes this region particularly suited for kilometer-scale climate projections. Within the ELVIC consortium, future climate projections at CP scale are planned to study how extreme precipitation events will evolve in the LVB and its surroundings.