1 Introduction

In recent decades, the relation between climate change and the increasing impact of extreme events has been widely studied and demonstrated (Intergovernmental Panel on Climate Change (IPCC) 2021). The increasing occurrence of extreme weather events, with the related loss of life and environmental and economic damage, is a global concern. Indeed, in 2015 all United Nations Member States adopted the 17 Sustainable Development Goals (SDGs), which are an urgent call to action for tackling climate change (Sachs et al. 2021). The scientific community has recently multiplied its efforts to identify short-term forecasting tools for such extreme events, with the aim of reducing their impact on society. In this framework, the strong correlation between lightning and extreme events has been widely discussed (Price and Rind 1994; Romps et al. 2014; Banerjee et al. 2014; Clark et al. 2017). Many authors report a positive correlation between increasing surface temperatures and lightning activity (Romps et al. 2014; Banerjee et al. 2014; Clark et al. 2017), but this correlation is not universally accepted (Finney et al. 2018). Nevertheless, the strong connection between lightning and natural events such as hail, tornadoes, and heavy rainfall has been widely documented (Tapia et al. 1998; Adamo et al. 2009; Schultz et al. 2015, 2017; Lagasio et al. 2017; Lynn 2017), as have the dangerous effects that Cloud-to-Ground (CG) lightning may have on infrastructure (such as wind turbines, transmission/distribution lines, and buried cables) and on human lives, and its role as a leading cause of forest fires (Cooper and Holle 2019). The complexity of the processes leading to lightning, and its possible correlation with severe weather events, motivates researchers to deepen the analysis of short-term forecasting tools that use lightning activity as an early indicator of extreme events.
Since lightning is an electric discharge characterizing both mesoscale and microscale events that exhibit sudden evolution and complex interaction with the surrounding atmosphere, reliable information on the location of future intense lightning activity has many fields of application. Indeed, a short-term forecasting tool may help identify and monitor the evolution of extreme events in near-real time. In particular, a timely forecast may support early warning systems by detecting unexpected changes in lightning activity, identifying a possible increase in the severity of the event in the following hours, and giving decision makers updated information to take the necessary safety measures. However, lightning nowcasting is still a great challenge and, given the complicated interaction between in-cloud processes and the many atmospheric processes leading to lightning, a wide range of approaches is available for this purpose. Many studies have implemented lightning forecasting models based either on the in-cloud electrification processes causing lightning or on atmospheric variables linked with lightning, such as Convective Available Potential Energy (CAPE) and precipitation rate (McCaul Jr et al. 2009, 2020; Lynn et al. 2012; Fierro et al. 2014; Tippett and Koshak 2018). Early approaches to lightning forecasting were based on analytical studies relating storm lightning rates to convective cloud top height (Price and Rind 1992). Later, other researchers based forecast methods on the statistical use of lightning climatologies (Bothwell 2005), i.e., an approach incorporating lightning climatology (e.g., thunderstorm activity, peaks in diurnal CG lightning activity, etc.) and predictor fields from Numerical Weather Prediction (NWP) models. Other researchers used methods based on measures of predicted buoyant instability aloft, as derived from numerical simulations (Bright et al. 2005), i.e., methods that help delineate potential thunderstorm areas by determining whether instability and appropriate thermodynamics for charge separation are coincident in observations and model forecasts. Moreover, in recent decades, elaborate full electrification schemes were developed for inclusion in explicit-convection numerical forecast models (e.g., Mansell et al. 2002; Fierro et al. 2007; Lynn 2017). These latter methods allow a detailed insight into storm electrical behavior and provide forecasts of the Flash Rate Density (FRD) and flash locations, considering the whole CG and intra-cloud (IC) activity. However, even after simplification of the complex lightning discharge process, most full electrification schemes remain computationally intensive and are still subject to errors in their quantitative forecasts of lightning flash rates, owing to the intrinsic low predictability of deep convection in the parent explicit-convection model (McCaul Jr et al. 2020). Consequently, the accuracy of the FRD results obtained up to now varies on a day-to-day basis, owing to the limitations of model physical parameterizations, input data and procedures, and model numerics. All of these factors produce inherent uncertainty in the model forecasts (McCaul Jr et al. 2020). The increasing computational capability of the last ten years has enabled researchers to investigate the sensitivity of NWP models to horizontal grid spacing, evaluating how to properly calibrate and interpret ensemble output and how to optimize trade-offs between model resolution and other computationally constrained parameters (Kain et al. 2008; Fiori et al. 2010; Bryan and Morrison 2012; VandenBerg et al. 2014; Potvin and Flora 2015).

In this framework, in recent decades, Machine Learning (ML) has helped improve the prediction of multi-data weather-related phenomena thanks to the integration of its strengths with atmospheric science. The ability of ML algorithms to model highly nonlinear functions is fundamental for their application to the analysis of the spatial and temporal variability of geo-environmental data (Kanevski et al. 2009), sometimes allowing uncertainty estimation (James et al. 2013; Guignard et al. 2021). Moreover, ML tools can take advantage of large datasets and of numerous input variables, or features (James et al. 2013). Therefore, they can solve regression and classification problems in high-dimensional (geo)input spaces, generally constituted by the geographical space and a set of spatially, temporally, or spatio-temporally referenced features. Different ML algorithms have been successfully applied to model phenomena such as, among others, water pollution (Leuenberger and Kanevski 2015), landslide (Taalab et al. 2018) and forest fire (Tonini et al. 2020) susceptibility, air temperature (Amato et al. 2020), and wind speed (Veronesi et al. 2016). In recent years, some ML algorithms have also been developed in the domain of lightning occurrence forecasting. Mostajabi et al. (2019) used XGBoost to perform 30-min-ahead forecasting of lightning occurrence based on a set of single-site observations of meteorological parameters. Blouin et al. (2016) used a tree-based classification algorithm to predict 6-h and 24-h CG lightning. A 1-h nowcasting model is proposed in Mecikalski et al. (2015), showing that lightning forecasts are made 45 min before rainfall occurs. In contrast, Zhou et al. (2020) developed a 0–1-h nowcasting model that uses data from geostationary meteorological satellites as input for a Deep Learning (DL) algorithm. 
Since satellite data have the advantage of monitoring and detecting the initial stages of convective clouds, they can be used for convective initiation (CI) monitoring and early warnings, when the detection over one spot is available.

As is well known, when dealing with ML algorithms, one of the most important issues is the preprocessing of the dataset to be used as input. In meteorological applications, different data (either observations or outputs of numerical models) are normally available at different spatial scales (e.g., point-wise, like lightning occurrence, or on the nodes of a grid in the case of NWP model outputs). However, to feed an ML algorithm, all data must be referred to a common spatial grid. Since the literature on specific applications of ML techniques to lightning nowcasting is quite limited, a study investigating the effect of such grid spacing on the final ability of the tool to nowcast lightning is still missing. The main goal of this study is to fill this gap, as a preliminary step of a more general research effort that aims at assessing the ability of ML techniques to nowcast lightning activity. To this end, results obtained using three different horizontal grids for data interpolation are compared, to test the sensitivity of an ML algorithm in detecting the processes leading to the development of CG lightning. Specifically, a 1-h-ahead forecast is performed over the Area Of Interest (AOI), here covering the Italian territory and the surrounding seas, by classifying each pixel of the study area according to the presence or absence of CG strokes. The classification is performed with Random Forest (RF). The comparison is performed over 3 months of 3 years (i.e., August, September, and October from 2017 to 2019).

The paper is organized as follows: Sect. 2 presents the methodology, discussing the input data, the horizontal spatial resolutions compared, and the classification algorithm used; Sect. 3 shows and discusses the results; Sect. 4 provides the conclusions of the paper.

2 Methodology

2.1 Input data

Aiming at enhancing the results obtained in (La Fata et al. 2021), the models proposed in this study are trained with meteorological data covering the period from August to October of 2017 and 2018 and tested with data for the same months of 2019. This choice is due to the particularly high lightning activity over the Italian peninsula and the surrounding seas during August–October 2018, as shown in (Nicora et al. 2021) and (Paliaga et al. 2019). The intense lightning activity in this period was accompanied by a significant amount of precipitation, as shown in Nicora et al. (2021); moreover, at the European scale, August 2018 was the fourth warmest August since 1880, after those of 2016, 2017 and 2015 (Paliaga et al. 2019). The AOI has been chosen to have a reasonably comparable number of pixels over land and sea, as shown in Fig. 1. The pixels over which the AOI has been analyzed include a sea area of Asea = 448 × 734 km2 and a land area of Aland = 301 × 38 km2, thus Asea is ~ 1.5 times Aland. Data have been interpolated on 3 different spatial grids covering the AOI. All features of each pixel are organized in a Data Frame (DF), and only pixels containing all data are used to train the models, i.e., if one feature is missing in a pixel, the pixel is excluded from the DF. Since the final goal of the work is to create an operational tool working in near-real time, the input variables are selected considering their availability, their spatio-temporal resolution, and the ease of retrieving and processing them, i.e., the time needed to receive, download, and process the data. Hence, all data to be used in the DF (from either observations/direct measurements or radar/satellite measurements) must be available sufficiently in advance to allow analysts to pre-process them and run the forecasting algorithm.

Fig. 1 AOI for algorithm application

The features chosen to train the model range from spatio-temporal information to observations or forecasted data produced by NWP, i.e.,

  • Latitude and Longitude: the coordinates belong to the spatial matrix on which all the features are gridded. The outcomes of the 10-year analysis in (Nicora et al. 2021) show a significant difference in lightning activity along the Italian peninsula. Moreover, a correlation between lightning activity and latitude is also reported in (Underwood 2006; Rakov 2013; Enno et al. 2020).

  • Digital Elevation Model (DEM [m ASL]): the analyses performed in (Paliaga et al. 2019) suggest that the orographic effect may be a factor enhancing the seasonal difference between lightning activity over sea and over land. Moreover, in (Underwood 2006; Mazarakis et al. 2008; Vogt and Hodanish 2014) the relative flash density was found to be correlated with terrain elevation, and (Poelman 2014) associates CG lightning peak currents with terrain elevation. Nevertheless, (Kotroni and Lagouvardos 2008) state that a positive correlation between lightning activity and terrain elevation is evident during spring and summer but not in autumn and winter. Consequently, aiming at creating a generalized ML algorithm, the terrain elevation is added as an input feature.

  • Temperature: a positive correlation between lightning and Sea Surface Temperature (SST) in autumn is shown in (Kotroni and Lagouvardos 2016), where it is suggested that higher SST destabilizes the lower tropospheric layers, thus enhancing convection and therefore lightning. The analyses in (Kotroni and Lagouvardos 2016) also suggest that this finding could be used to forecast the intensity of lightning activity. Similar considerations can be found in (Nicora et al. 2021), in which, during autumn, strong lightning and precipitation activity is detected and linked with the strong interaction between a warmer SST and a significant amount of unstable moist air. Moreover, lightning activity has been linked with the solar heating cycle in (Enno et al. 2020); indeed, a summer peak in lightning activity was observed at mid-latitudes, whereas the Mediterranean experienced an autumn maximum. Most of the lightning occurred over land from March to August, whereas from September to February it was concentrated over the Mediterranean. Consequently, the brightness temperature [K] measured by Meteosat Second Generation in the infrared channel (10.8 µm) is used as an input parameter for the ML algorithm. The refresh time is 15 min. Since all other features in the input space have a temporal resolution of 1 h, the hourly mean and the hourly standard deviation of the temperature are computed.

  • Zonal u (or x-coordinate) and meridional v (or y-coordinate) components of the horizontal wind vector [m s−1]: the microphysical and kinematic characteristics of the relationship between lightning and convective processes have been extensively studied. In particular, because lightning needs a region of strong electric field to be initiated, flash initiations tend to cluster in the vicinity of updraft cores, where either sedimentation alone or sedimentation combined with wind shear or turbulence produces gradients in the charged particles, creating the electric fields needed to initiate lightning (Calhoun et al. 2013). Consequently, both zonal and meridional wind components are used in the DF at 4 pressure levels (1000, 850, 700, 500 hPa) to account for the wind shear effect. The components are provided by the daily forecasts of the COnsortium for Small-scale Modeling (COSMO)-I5 model over the period under investigation. COSMO-I5 (Steppeler et al. 2003) is the limited-area, non-hydrostatic model used by the COSMO consortium, with two daily runs (00 and 12 UTC) and 72 h of forecast over the Mediterranean basin.

  • Vorticity: many studies demonstrate that deep atmospheric convective processes are characterized by intense vertical velocities, able to reach the zone of the atmosphere where lightning phenomena occur (Petersen et al. 1999; Deierling and Petersen 2008; Wang et al. 2015; Huang 2021). The findings in (Mazarakis et al. 2008) confirm this idea. Since the available data included only the zonal and meridional components of the horizontal wind vector, the vertical component of vorticity, \({\omega}_{z}\), has been calculated at 1000 hPa to support the ML algorithm in identifying the zones of most intense convection at the lower levels of the atmosphere, i.e., (Holton 1992)

    $$\omega_{z} = \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}$$
    (1)

    where the spatial derivatives have been approximated with first-order finite differences.

  • Precipitation: many studies (Adamo et al. 2009; Tapia et al. 1998; Soula and Chauzy 2001; Lagasio et al. 2017) have revealed the strong correlation between lightning and the evolution of severe rainfall processes in thunderstorms, confirming the hypothesis that lightning activity may be useful to track the motion of the convective cores associated with severe rainfall. Consequently, measured precipitation data are included as an input feature in the DF to train the models. From the observational point of view, the 1-h precipitation accumulation from the Integrated Multi-satellitE Retrievals for GPM (IMERG) is used. Moreover, to validate the data retrieved from the COSMO-I5 model, the precipitation forecasted by COSMO-I5 is compared with the precipitation values detected by the Italian radar network, obtained with the Modified Conditional Merging (MCM) technique. Precipitation data computed by the COSMO-I5 model are considered validated if:

    $$\left| {rr_{measured} - rr_{COSMO} } \right| \le T$$
    (2)

    where \({rr}_{measured}\) is the precipitation from radar, \({rr}_{COSMO}\) represents the precipitation modelled by COSMO-I5 and T is a threshold, defined for 4 different hourly cumulated rainfall levels, as shown in Table 1.

  • Distance Sea/land: the analyses in (Nicora et al. 2021; Paliaga et al. 2019; Enno et al. 2020) found that lightning activity may differ over sea and land areas. Moreover, the results in (La Fata et al. 2021) suggest that longitude may be relevant when classifying pixels with or without CG strokes. Consequently, the distance of each land pixel from the sea has been added as a feature in the input space.

    All the above-mentioned input data have different spatial and temporal resolutions; thus, interpolation is needed to create a DF to train and test the ML algorithm. Ideally, the influence of interpolation errors should be kept as small as possible. Considering the availability of the data and their different temporal and horizontal spatial resolutions, and since the best resolution on which to interpolate all the data is unknown a priori, the accuracies of the forecasts obtained with the 3 available spatial grids are compared:

    Horizontal grid of Temperature data (HT): 0.05° × 0.0611°,

    Horizontal grid of Wind components data (HW): average pixel dimension of 0.045° × 0.0681°,

    Horizontal grid of Precipitation data (HP): 0.1° × 0.1°.
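
As an illustration of what referring all features to one of the three grids involves, the following minimal sketch resamples a field from its native grid onto a target grid. It assumes nearest-neighbour resampling on regular latitude/longitude axes; the interpolation scheme actually used is not specified in the text, and the function name `regrid_nearest` and the toy coordinates are hypothetical.

```python
import numpy as np

def regrid_nearest(field, src_lat, src_lon, dst_lat, dst_lon):
    """Resample a 2-D field (lat x lon) onto a target grid by nearest neighbour."""
    # Index of the closest source coordinate for every target coordinate.
    i_lat = np.abs(src_lat[:, None] - dst_lat[None, :]).argmin(axis=0)
    i_lon = np.abs(src_lon[:, None] - dst_lon[None, :]).argmin(axis=0)
    # np.ix_ builds the index mesh that picks one source pixel per target pixel.
    return field[np.ix_(i_lat, i_lon)]

# Toy data (synthetic, not real measurements): a 0.1 deg field, i.e. HP-like
# spacing, resampled onto a finer 0.05 deg grid.
src_lat = np.linspace(36.0, 36.9, 10)
src_lon = np.linspace(10.0, 10.9, 10)
field = np.add.outer(src_lat, src_lon)
dst_lat = np.linspace(36.0, 36.9, 19)
dst_lon = np.linspace(10.0, 10.9, 19)
regridded = regrid_nearest(field, src_lat, src_lon, dst_lat, dst_lon)
print(regridded.shape)
```

Nearest-neighbour resampling is used here only because it is the simplest choice; a bilinear or conservative scheme would change the interpolation error that the comparison among HT, HW and HP is meant to expose.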

Table 1 Thresholds used to validate the COSMO I5 model

The temporal resolution is set to 1 h for all input data.
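
Two of the derived features described above, the hourly statistics of the 15-min brightness temperature and the vertical vorticity of Eq. (1), can be sketched as follows. This is a minimal illustration under stated assumptions: a regular grid with constant spacing `dx`, `dy` in metres, forward differences in the interior with a backward difference at the last row and column, and the population standard deviation. The function names and the synthetic fields are hypothetical.

```python
import numpy as np

def hourly_stats(tb_scans):
    """Mean and standard deviation over the scans of one hour.

    tb_scans has shape (n_scans, ny, nx), e.g. four 15-min scans.
    """
    return tb_scans.mean(axis=0), tb_scans.std(axis=0)

def vertical_vorticity(u, v, dx, dy):
    """omega_z = dv/dx - du/dy with first-order differences; u, v are (ny, nx)."""
    dv_dx = np.empty(v.shape, dtype=float)
    dv_dx[:, :-1] = (v[:, 1:] - v[:, :-1]) / dx   # forward difference along x
    dv_dx[:, -1] = dv_dx[:, -2]                   # backward difference at the last column
    du_dy = np.empty(u.shape, dtype=float)
    du_dy[:-1, :] = (u[1:, :] - u[:-1, :]) / dy   # forward difference along y
    du_dy[-1, :] = du_dy[-2, :]                   # backward difference at the last row
    return dv_dx - du_dy

# Synthetic brightness temperatures [K] for the four scans of one hour.
tb_scans = np.stack([np.full((4, 5), t) for t in (280.0, 282.0, 284.0, 286.0)])
tb_mean, tb_std = hourly_stats(tb_scans)

# Synthetic solid-body rotation: u = -rate*y, v = rate*x gives omega_z = 2*rate.
rate = 1e-4            # s^-1, hypothetical
dx = dy = 5000.0       # m, hypothetical grid spacing
y, x = np.meshgrid(np.arange(4) * dy, np.arange(5) * dx, indexing="ij")
u, v = -rate * y, rate * x
zeta = vertical_vorticity(u, v, dx, dy)
```

On a latitude/longitude grid the metric spacing varies with latitude, so a constant `dx` is a simplification that would need to be replaced by a per-row spacing in a real application.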

The output of each model is a Boolean variable indicating the presence or absence of CG strokes in a specific spatial location in the following hour. For each of the three spatial resolutions over which the input data have been gridded, two models have been created: one forecasting the pixels in which positive strokes are detected, and one forecasting the pixels in which negative strokes are detected. The reason for creating separate models for positive and negative strokes lies in their different frequency of occurrence and formation mechanisms. (Diendorfer et al. 2009) and (Cooray and Arevalo 2017) report that negative lightning flashes account for about 90% or more of global CG lightning, and that 10% or less of CG discharges transport positive charge to Earth. Moreover, (Rakov 2013) and (Diendorfer et al. 2009) state that positive lightning can be the dominant type of CG lightning during the cold season, during the dissipating stage of a thunderstorm, and in some other situations, including severe storms and thunderclouds formed over forest fires or contaminated by smoke. Differences in the scenarios leading to positive and negative lightning can be found in (Nag and Rakov 2012) and (Cooray and Arevalo 2017). (Nag and Rakov 2012) also state that several properties of positive lightning (e.g., number of strokes per flash, occurrence of continuing current, leader propagation mode, and branching) appear to be distinctly different from those of negative lightning.

As stated before, the pattern recognition models seek the presence or absence of CG strokes in a specific pixel in the following hour. The source of CG stroke data is the Ground Stroke Density (GSD) provided by the CESI-SIRF Lightning Location System (LLS), now the property of Meteorage SAS. The sensors of the CESI-SIRF LLS were first installed in 1994, and the system consists today of VAISALA LS7002 sensors over the Italian territory. SIRF is a founding member of the EUCLID (EUropean Cooperation for LIghtning Detection) network, a pan-European cooperation aiming at sharing LLS data. All EUCLID sensors operate in the same frequency range. The sensor redundancy of the EUCLID network allows the Italian territory to be covered theoretically by 150 sensors and, in practice, at least by the 15 sensors closest to the national borders. Stroke data have been processed by defining a Boolean matrix whose entries are 1 for pixels in which at least one stroke is detected (class 1) and 0 elsewhere (class 0). The DF over the AOI, for each hour, is composed of pixels including the 16 above-mentioned features and a 17th dimension representing the target, i.e., the presence/absence of strokes. The resulting DF is extremely unbalanced: e.g., 75% of pixels have fewer than 2 strokes and, excluding pixels containing no strokes, 75% of the pixels of the reduced DF include fewer than 5 strokes. Sub-Sect. 2.2 will introduce the balancing methodology applied to address this issue.
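
The construction of the Boolean target described above can be sketched as follows. This is a minimal illustration, assuming that the strokes of one hour are given as arrays of latitude and longitude and that the grid is described by its cell edges; the function name `stroke_target` and the toy coordinates are hypothetical.

```python
import numpy as np

def stroke_target(stroke_lat, stroke_lon, lat_edges, lon_edges):
    """Count strokes per pixel and flag pixels with at least one stroke."""
    counts, _, _ = np.histogram2d(stroke_lat, stroke_lon,
                                  bins=[lat_edges, lon_edges])
    target = (counts >= 1).astype(np.uint8)   # class 1 if any stroke, else class 0
    return counts.astype(int), target

# Toy grid of 2 x 3 pixels and three synthetic strokes recorded in one hour.
lat_edges = np.array([36.0, 36.1, 36.2])
lon_edges = np.array([10.0, 10.1, 10.2, 10.3])
stroke_lat = np.array([36.05, 36.05, 36.15])
stroke_lon = np.array([10.05, 10.06, 10.25])
counts, target = stroke_target(stroke_lat, stroke_lon, lat_edges, lon_edges)
print(target)
```

Running the same function separately on positive and negative strokes would give the two targets used by the two families of models.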

2.2 Classification algorithm

The problem of determining the presence or absence of strokes in a specific spatial location during the next hour has been formulated as the classification of each pixel of the AOI. The classification has been performed using RF. The choice is due to the demonstrated potential of the RF technique for nowcasting problems closely related to the lightning phenomenon, such as the prediction of small-scale storm initiation, the diagnosis of turbulence, mesoscale convective system initiation and lightning activity, respectively shown in (Breiman 2001; Williams 2014; Blouin et al. 2016; Ahijevych et al. 2016). RF is an ensemble method based on Classification and Regression Decision Trees, originally introduced in (Breiman 2001). A classification tree is an algorithm in which the input space is divided into \(L\) non-overlapping regions \(R_1, R_2, \dots, R_L\). A recursive binary splitting starting from the top of the tree is used to build the regions: at the top of the tree all the observations are included in one single region and, successively, each region of the predictor space is split into two new ones. Specifically, calling \(s\) the cut point and \(l\) the index of the predictor used for the split, the pair of half-planes produced at each step is defined as:

$$R_{1}\left( l,s \right) = \left\{ X \mid X_{l} < s \right\} \quad \text{and} \quad R_{2}\left( l,s \right) = \left\{ X \mid X_{l} \ge s \right\}$$
(3)

where \(\{X|{X}_{l}<s\}\) denotes the region of the predictor space in which the predictor \({X}_{l}\) takes a value lower than \(s\), and \(\{X|{X}_{l}\ge s\}\) the complementary region. Once the \(L\) regions are defined, predictions are computed by assigning to each sample the most commonly occurring class among the training observations in its region. At each step of the binary splitting, the values of \(s\) and \(l\) are obtained by minimizing the Gini index G, computed as:

$$G = \mathop \sum \limits_{k = 1}^{K} \hat{p}_{mk} \left( {1 - \hat{p}_{mk} } \right)$$
(4)

where \({\widehat{p}}_{mk}\) is the proportion of the training observations in the \(m\)th region belonging to the \(k\)th class. RF is built as an ensemble of several trees generated from bootstrapped training samples. Moreover, to ensure a proper decorrelation among the trees, only a subset of \(m\) of the \(p\) features is chosen as split candidates for each single tree (here \(m\) denotes the subset size and is distinct from the region index used in (4)). An advantage of this approach is the possibility to study the importance of each feature, hence improving the understanding of the phenomenon analysed. Specifically, the reduction of the Gini index due to splits on a given predictor can be averaged over all the trees; high values of this metric correspond to more important predictors.
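
To make the splitting rule of Eqs. (3) and (4) concrete, the following minimal sketch searches, for a single node, the predictor index \(l\) and cut point \(s\) that minimize the Gini index. It rests on two assumptions that the text does not state: candidate cut points are the midpoints between consecutive observed values, and the Gini indices of the two regions are combined by a size-weighted average, as is customary for classification trees. The function names and the toy data are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini index of one region: sum over classes of p_k * (1 - p_k)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return sum((c / n) * (1 - c / n) for c in Counter(labels).values())

def best_split(X, y):
    """Return (l, s, score) minimizing the size-weighted Gini of R1 and R2."""
    n = len(y)
    best = (None, None, float("inf"))
    for l in range(len(X[0])):
        values = sorted({row[l] for row in X})
        for lo, hi in zip(values, values[1:]):
            s = (lo + hi) / 2                       # candidate cut point
            r1 = [yi for row, yi in zip(X, y) if row[l] < s]
            r2 = [yi for row, yi in zip(X, y) if row[l] >= s]
            score = (len(r1) * gini(r1) + len(r2) * gini(r2)) / n
            if score < best[2]:
                best = (l, s, score)
    return best

# Toy data: feature 0 separates the two classes, feature 1 does not.
X = [[0.0, 5.0], [0.2, 1.0], [0.4, 4.0], [3.0, 2.0], [3.5, 3.0], [4.0, 0.5]]
y = [0, 0, 0, 1, 1, 1]
l, s, score = best_split(X, y)
print(l, s, score)
```

A full tree repeats this search recursively inside each new region, and a forest repeats the tree construction on bootstrapped samples with a restricted set of candidate features.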

To apply RF over the AOI for the creation of the ML algorithms to nowcast lightning, data have been split into a train and a test set and organized as follows: pixels containing at least 1 stroke (class 1) are counted; pixels containing 0 strokes (class 0) are randomly selected among the pixels of the DF where the precipitation forecasted by the COSMO-I5 model is validated, as indicated in Table 1, so that class 0 and class 1 are balanced (i.e., contain the same number of pixels). The balancing procedure has been performed for a DF containing only locations of positive strokes and for a DF containing only locations of negative strokes. The resulting DF is composed of balanced 0–1 pixels for August, September and October 2017 and 2018 as the train set, and of pixels for August, September and October 2019 as the test set. Although the approach adopted to split the data considers neither the spatial nor the temporal dependencies among data, the spatial dependence is explicitly taken into account by adding the geographical coordinates (latitude and longitude) to the input space. After normalizing the input features to the interval [0,1], a five-fold Cross-Validation (CV) has been performed to train RF and determine optimal values for the parameters.
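
The balancing and normalization steps can be sketched as follows. This is a minimal illustration under several assumptions: class 0 is randomly undersampled without replacement, the validation mask follows Eq. (2) with a single hypothetical threshold `T` (the values of Table 1 are not reproduced here), and the min-max scaling parameters are fitted on the balanced training data. The function names and the synthetic arrays are hypothetical.

```python
import numpy as np

def balance_classes(X, y, eligible0, seed=0):
    """Keep all class-1 pixels and an equal number of eligible class-0 pixels."""
    rng = np.random.default_rng(seed)
    idx1 = np.flatnonzero(y == 1)
    pool0 = np.flatnonzero((y == 0) & eligible0)
    n0 = min(len(idx1), len(pool0))
    idx0 = rng.choice(pool0, size=n0, replace=False)   # random undersampling
    keep = np.sort(np.concatenate([idx1, idx0]))
    return X[keep], y[keep]

def minmax_fit(X):
    """Per-feature minimum and range; constant features get a range of 1."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return lo, np.where(hi > lo, hi - lo, 1.0)

def minmax_apply(X, lo, span):
    return (X - lo) / span

# Synthetic DF: 200 pixels, 3 features, 20 pixels with strokes.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
y = np.zeros(200, dtype=int)
y[:20] = 1
rr_radar = rng.uniform(0.0, 5.0, size=200)
rr_cosmo = rr_radar + np.where(np.arange(200) % 2 == 0, 0.5, 2.0)
T = 1.0   # hypothetical threshold [mm/h], not taken from Table 1
eligible0 = np.abs(rr_radar - rr_cosmo) <= T            # Eq. (2)

X_bal, y_bal = balance_classes(X, y, eligible0)
lo, span = minmax_fit(X_bal)
X_norm = minmax_apply(X_bal, lo, span)
```

With scaling parameters fitted on the training data only, test features may fall slightly outside [0,1]; whether the study handles this differently is not stated.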

2.3 Model evaluation metrics

Even in high-activity regions, CG lightning strikes are rare events (Mostajabi et al. 2019). It is important for the nowcasting tool to correctly predict both lightning and non-lightning events, the latter being numerically dominant. However, while a low false alarm rate is desirable, it is not by itself indicative of good predictive skill. In other words, in imbalanced databases, as is the case in the present study, the overall accuracy may not be sufficient to evaluate how much better than chance the prediction scheme performs. To address this issue, the lightning literature suggests several evaluation metrics. For the purposes of this study, the results are compared by means of three indices commonly used in forecasting rare events: Probability of Detection (POD), False Alarm Rate (FAR) and Accuracy. The elements needed to compute these parameters are obtained from the Confusion Matrices (CMs) of the models. CMs are composed of 4 elements: True Negatives (TN, i.e., pixels belonging to class 0 correctly labelled as class 0), True Positives (TP, i.e., pixels belonging to class 1 correctly labelled as class 1), False Positives (FP, i.e., pixels belonging to class 0 wrongly labelled as class 1) and False Negatives (FN, i.e., pixels belonging to class 1 wrongly labelled as class 0). POD is calculated as

$$POD = \frac{TP}{{TP + FN}} ,$$
(5)

FAR is calculated as

$$FAR = \frac{FP}{{TP + FP}} ,$$
(6)

and the Accuracy is calculated as

$$Accuracy = \frac{TP + TN}{{TP + TN + FP + FN}}.$$
(7)
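
The three indices of Eqs. (5)–(7) can be computed from the confusion-matrix counts as in the following minimal sketch. The function names and the toy labels and probabilities are hypothetical; the default threshold of 0.5 is the one used for the results in Sect. 3.1.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count TP, FP, TN, FN after labelling class 1 where probability >= threshold."""
    tp = fp = tn = fn = 0
    for truth, prob in zip(y_true, y_prob):
        pred = 1 if prob >= threshold else 0
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def scores(tp, fp, tn, fn):
    """POD, FAR and Accuracy as defined in Eqs. (5)-(7)."""
    pod = tp / (tp + fn) if (tp + fn) else float("nan")
    far = fp / (tp + fp) if (tp + fp) else float("nan")
    acc = (tp + tn) / (tp + tn + fp + fn)
    return pod, far, acc

# Toy example: 4 pixels with strokes, 6 without.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.5, 0.05]
tp, fp, tn, fn = confusion_counts(y_true, y_prob)
pod, far, acc = scores(tp, fp, tn, fn)
print(pod, far, acc)  # 0.75 0.4 0.7
```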

3 Results and discussion

3.1 Results

The CMs of the best models obtained with the RF technique, with data gridded over the 3 different horizontal spatial resolutions, have been calculated for all data selected for August–October 2019. The predicted probability threshold is set to 0.5, i.e., a pixel is labelled as belonging to class 1 if the probability calculated by the model for that pixel is ≥ 0.5. To compare the results obtained using the HT, HW and HP resolutions, typical parameters related to lightning detection are computed: the POD, as defined by (5), the FAR, as defined by (6), and the Accuracy, as defined by (7). Results are summarized in Tables 2 and 3 and shown in Fig. 2. Considering POD, FAR and Accuracy, the best performing model is the one for which input data have been re-gridded using the HW resolution, confirming earlier results showing that higher data resolutions yield more reliable and better performing models (Kain et al. 2008; VandenBerg et al. 2014; Potvin and Flora 2015). Consequently, the feature importance resulting from the best RF models obtained with data re-gridded using HW is shown in Fig. 3. For data gridded with the HW resolution, for both positive and negative strokes, precipitation is the most important feature, highlighting the ability of the model to link the lightning phenomenon with the meteorological variables leading to the formation of cumulonimbus clouds and precipitation, in agreement with the outcomes of (Paliaga et al. 2019; Enno et al. 2020; Nicora et al. 2021). Longitude is the second most important feature. Nevertheless, the variable representing the distance from land to the sea surface is the least important for positive strokes and the second least important for negative strokes. 
The importance of the wind components varies with pressure level: for both positive and negative strokes it decreases as the pressure level increases, i.e., the zonal and meridional wind components at 500 hPa are the most important among such data. The importance of temperature data lies between those of the wind components at the lower pressure levels. This may be because temperature values are obtained from infrared satellite measurements: since the source of lightning is usually a cumulonimbus (thundercloud) (Rakov 2013), it is reasonable to think that temperature values over a cloud area are linked with wind components at lower pressure levels (i.e., 500–700 hPa). The low importance of Latitude and DEM shows that the impact of meteorological features describing the atmosphere at higher altitudes is more relevant than that of topographic data.

Table 2 Percentage values of POD, FAR and Accuracy for Positive strokes
Table 3 Percentage values of POD, FAR and Accuracy for Negative strokes
Fig. 2 Histograms of the percentage values of POD, FAR and Accuracy on the Test set obtained with the resulting best RF models for positive (left) and negative (right) strokes. Orange bars refer to the model created re-gridding all data of the DF over the HP resolution. Yellow bars refer to the model created re-gridding all data of the DF over the HW resolution. Green bars refer to the model created re-gridding all data of the DF over the HT resolution

Fig. 3 Features’ importance of the resulting best RF model obtained for data gridded with HW resolution, for positive (left) and negative (right) strokes. Wind vorticity is labelled as “vertical_vorticity_wind_1000”, zonal (meridional) components of horizontal wind are labelled as wind_u (wind_v) followed by the pressure level (e.g., “wind_u_1000”)

3.2 Discussion

Accuracy values higher than 57% and FAR values lower than 37%, for both positive and negative strokes, highlight the ability of the models to produce forecasts with all three horizontal spatial resolutions. Nevertheless, a reliable nowcasting tool should reach higher Accuracies. Thus, in the following paragraphs, possible solutions are explored and further data to be added to the input space are discussed.

The Accuracy values obtained with the HP and HT resolutions are similar for both positive and negative strokes, while the POD and FAR values differ. Although, for negative strokes, the FAR for data gridded on the HT resolution is lower than the FAR for data gridded on the HP resolution, the POD for data gridded on the HP resolution is higher. Overall, for both positive and negative strokes, the best performing algorithm is the one with data gridded using the COSMO resolution, i.e., HW. Figure 2 shows that the models with data gridded over HW reach the highest POD and Accuracy and the lowest FAR. This result may be attributed to the fact that half of the features used to train the models, i.e., all zonal and meridional wind components, are natively available at the COSMO resolution. Thus, re-gridding all data to the COSMO resolution introduces interpolation errors in fewer than half of the variables (the vertical vorticity of wind at 1000 hPa is also computed from wind data). The better performance of the models trained with data gridded using the HW resolution supports literature results indicating better results when using high spatial resolution. The differences between the results obtained for positive and negative strokes may also be attributed to the differences in their formation mechanisms (Nag and Rakov 2012; Cooray and Arevalo 2017).

The results are in line with the high POD and low FAR obtained by Blouin et al. (2016), who used a lower spatial resolution, and by Mostajabi et al. (2019), whose best results were obtained with an XGBoost algorithm performing 30-min-ahead forecasting of lightning occurrence based on a set of single-site observations of meteorological parameters. In contrast, the results of Zhou et al. (2020), obtained using data from geostationary meteorological satellites as input to a DL algorithm, outperform the presented model for hours characterized by highly intense lightning activity. Consequently, future enhancements of the models presented here may consider the introduction of satellite and radar data as input parameters, with a resolution sufficient (Weisman et al. 1997; Skamarock 2004; Kain et al. 2006) to detect and capture CI and its evolution.

The results presented in this paper should be regarded as part of ongoing research, and many relevant questions remain to be investigated. Future studies will deal with the optimization of the spatio-temporal correlation among data. Indeed, for all the presented models, the spatial structure of the data was considered by explicitly adding the geographical coordinates to the input space. Nevertheless, the effect of neighbouring observations in both space and time was not considered in the proposed models. Embedding strategies may be investigated to better account for the spatio-temporal nature of the investigated phenomena. The use of DL algorithms may also ensure a better modelling of the spatio-temporal dependencies typical of environmental data because of their capability to automatically extract features in both the spatial and temporal domains. Finally, this manuscript analysed the forecasting problem in terms of binary classification of the presence/absence of strokes, considering only CG lightning. Further variables that allow the creation mechanisms of positive and negative lightning to be distinguished should be added to the input space in future studies. Consequently, future improvements of the work presented in this paper will include investigations related to embedding strategies and the definition of a further classification or regression analysis:

  • one option could be the definition of a threshold related to the typical values of the GSD, as reported in (Nicora et al. 2021; Paliaga et al. 2019) and references therein: the classifier would then indicate whether such a threshold is exceeded.

  • alternatively, the problem could be framed as a regression, with the GSD as the output variable.
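The two options above differ only in how the target variable is built. As a minimal sketch, assuming the GSD is available as one non-negative number per grid cell and hour, the snippet below derives a binary label from a threshold (first option) and keeps the raw value as a regression target (second option). The threshold, the sample GSD values, and the function name are hypothetical placeholders, not values taken from the cited works.

```python
from typing import List, Tuple


def build_targets(gsd_values: List[float], threshold: float) -> Tuple[List[int], List[float]]:
    """Return (classification labels, regression targets) for the same cells."""
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    # Option 1: label is 1 when the GSD exceeds the threshold, 0 otherwise
    labels = [1 if g > threshold else 0 for g in gsd_values]
    # Option 2: the GSD itself is the quantity to be predicted
    targets = list(gsd_values)
    return labels, targets


gsd_values = [0.0, 0.4, 1.2, 3.5, 0.9]  # hypothetical per-cell GSD values
GSD_THRESHOLD = 1.0                     # hypothetical threshold
labels, targets = build_targets(gsd_values, GSD_THRESHOLD)
```

The same feature set could feed either formulation, so the choice between them mainly affects the learning algorithm and the evaluation metrics.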

4 Conclusion

A timely forecast of lightning activity may support early warning systems by giving decision makers updated information to take the necessary safety measures during severe weather events. However, the complexity of the atmospheric processes that lead to lightning activity still makes such a forecast a difficult task. The present paper investigated the possibility of using ML techniques to develop an effective and timely forecast of lightning activity. Specifically, an application of RF has been presented to perform spatially explicit 1-h-ahead nowcasting of lightning occurrence over the Italian national territory, including its surrounding seas. Since the best resolution to which all the data used in the DF should be interpolated is unknown a priori, three models have been created, based on the three different horizontal spatial grids of the data used to train the model. The results have been compared via typical evaluation metrics for extreme events, suggesting that the finest available spatial resolution increases the effectiveness of the model. Moreover, a comparative analysis of the available features has revealed the low importance of Latitude and DEM data with respect to the meteorological features describing the atmosphere, especially at higher altitudes. The encouraging results obtained in terms of forecasting Accuracy suggest that, after proper improvements, ML-based algorithms could find their place in wider early-warning systems to support disaster risk management procedures. For this reason, future work will explore the inclusion of further meteorological variables in the feature set, as well as the move from classification to regression algorithms to properly estimate the GSD over the AOI.
In this way, the present paper represents the preliminary study of a medium-term research effort whose final goal is the creation of an operational tool working in near real time to recognize and monitor unexpected and alarming changes in lightning activity, thus identifying the possibility of an extreme event in the following hours.