Abstract
Stable isotopes of hydrogen and oxygen are important natural tracers with a wide variety of environmental applications (e.g., the exploration of the water cycle, ecology and food authenticity). The spatially explicit predictions of their variations are obtained through various interpolation techniques. In the present work, a classical random forest (RF) and two of its variants were applied. RF and a random forest version employing buffer distance (RFsp) were applied to each month separately, while a random forest model was trained using all data employing month and year as categorical variables (RFtg). Their performance in predicting the spatial variability of precipitation stable oxygen isotope values for 2008–2017 across Europe was compared. In addition, a comparison was made with a publicly available alternative machine learning model which employs extreme gradient boosting. Input data was retrieved from the Global Network of Isotopes in Precipitation (GNIP; no. of stations: 144) and other national datasets (no. of stations: 127). Comparisons were made on the basis of absolute differences, median, mean absolute error and Lin’s concordance correlation coefficient. All variants were capable of reproducing the overall trends and seasonal patterns over time of precipitation stable isotope variability measured at each chosen validation site across Europe. The most important predictors were latitude in the case of the RF, and meteorological variables (vapor pressure, saturation vapor pressure, and temperature) in the case of the RFsp and RFtg models. Diurnal temperature range had the weakest predictive power in every case. In conclusion, it may be stated that with the merged dataset, combining GNIP and other national datasets, RFsp yielded the smallest mean absolute error 1.345‰) and highest Lin’s concordance correlation coefficient (0.987), while with extreme gradient boosting (based on only the GNIP data) the mean absolute error was 1.354‰, and Lin’s concordance correlation coefficient was 0.984, although it produced the lowers overall median value (1.113‰), while RFsp produced 1.124‰. The most striking systematic bias was observed in the summer season in the northern validation stations; this, however, diminished from 2014 onward, the point after which stations beyond 55° N are available in the training set.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
The ratio between the heavy and light stable isotopes in the water molecule (18O/16O; 2H/1H) is an effective tool in resolving practical problems in environmental isotope geochemistry that arise in the disciplines of hydrology, climatology, biogeochemistry etc. (Coplen et al. 2000). The stable isotope composition of hydrogen and oxygen is conventionally expressed as δ values (δ2H and δ18O respectively) reported in units per mille (‰) (Coplen 1994). The stable isotopic composition of hydrogen and oxygen in precipitation (δp) provides an insight into the origin of water vapor, the conditions prevailing during condensation and precipitation (Aggarwal et al. 2016; Dansgaard 1964). Exploiting these variations, water (precipitation) stable isotopes have become important natural tracers in the study of the water cycle (Bowen and Good 2015; Fórizs 2003). With the continuous advancement in effectiveness and availability of analytical tools, the spatiotemporal abundance of precipitation stable isotope measurements is steadily increasing (Wassenaar et al. 2021; Yoshimura 2015), providing sufficient ground for the development of spatially continuous datasets of δp variability. These datasets can be utilized for advanced hydrological applications of precipitation stable isotopes where additional information can be gained from not only having point data, but spatially continuous information. Such applications can be found in hydrogeology (Bowen and Good 2015; Clark and Fritz 1997), limnology (Birkel et al. 2018; Nan et al. 2019), water resource management (Bowen and Good 2015; Gibson and Edwards 2002), the exploration of changes in moisture source conditions (Amundson et al. 1996), animal migration studies (Hobson 1999; Hobson and Wassenaar 1996), food source traceability (Heaton et al. 2008), and forensic sciences (Ehleringer et al. 2008) as well.
In generating spatially continuous datasets of isotopic variability, traditional kriging interpolation was demonstrated to outperform other approaches right from its inception (Bowen and Revenaugh 2003) and has since become the gold standard for mapping the spatial variability of precipitation isotopes on scales ranging from the global (e.g. Bowen 2010; Terzer et al. 2013) to the regional; e.g. Hatvani et al. (2020), Hatvani et al. (2017), Kaseke et al. (2016), Kern et al. (2014). In recent years, machine learning (ML) approaches (artificial neural networks, support vector machine and similar techniques) have increasingly proven their worth in extracting patterns and insights from the ever-increasing stream of geospatial data (Gopal 2016; Szatmári and Pásztor 2019), and have opened new perspectives on the understanding of data-driven Earth system science problems (Reichstein et al. 2019). Although the number of ML applications employed in deriving isoscapes is still scarce, their number is steadily increasing in fields as diverse as studies on bioavailable strontium (Bataille et al. 2020; Funck et al. 2021), on sulfur in human remains (Bataille et al. 2021), the stable isotope composition of nitrogen and carbon in particulate matter from the Northwest Atlantic Continental Shelf (Oczkowski et al. 2016), and studies on the isotopic composition of shallow groundwater (Stahl et al. 2020).
There have been comparative studies on the performance of different techniques in relation to predicting and mapping geochemical parameters (including stable isotopes), a task in which ML turned out to perform as well as combined geostatistical tools (Hengl et al. 2018), or even better than them (Bataille et al. 2018; Li et al. 2011). Specifically, a methodological comparison between a regression kriging approach and Random forest (RF) algorithm using daily and monthly δp datasets from European subregions showed that the ML approach is capable of outperforming the traditional methodology (Erdélyi et al. 2023). The first ever ML algorithm used in estimating the spatial variability of monthly δp in Europe was released recently (Nelson et al. 2021), employing extreme gradient boosting (XGBoost) algorithm called PisoAI that uses the Global Network for Isotopes in Precipitation (GNIP (IAEA 2019)) dataset. The performance of the different ML methods in predicting a spatially continuous dataset of monthly δp has not yet been compared. In addition, it is clear that there is still much to be learned about the behavior of ML tools in estimating the spatial variability of precipitation stable isotopes, and the difficulties and problems originating from the different spatiotemporal resolutions and predictors.
The aim of the present work is to explore the stable isotopic variability of precipitation across the European continent with the use of three variants of random forest algorithms. Specifically, to compare the performance of random forest approaches with the latest ML derived precipitation δ18O isoscape for Europe. In this task, not only the GNIP, but other networks’ data is used increasing the volume of the data by ~ 33%; which is in contrast to the usual approach in subcontinental precipitation stable isotope modeling.
2 Materials and methods
2.1 Data used
2.1.1 Precipitation isotopic data and preprocessing
Monthly δp values were acquired from a total of 480 monitoring stations operating in Europe and its vicinity between 1960 and 2021 (Fig. 1), primarily from the Global Network of Isotopes in Precipitation (GNIP), extended by the inclusion of 161 stations from regional networks and individual records; see Electronic supplementary material for details and the corresponding references. In general, the data gathered were unequally distributed in space (Fig. 1A) and time (Fig. 1B,C). Until the mid-1970s, the number of active precipitation isotope monitoring stations in Europe was limited, and those primarily belonged to the GNIP (n < 20). Afterwards, their number rose progressively, owing to the growth of the Austrian Network of Isotopes in Precipitation (ANIP) (Kralik et al. 2003), reaching about 80 stations for oxygen- (δ18O) (Fig. 1B), and 79 stations for hydrogen (δ2H) (Fig. 1C) precipitation stable isotopes by the turn of the millennium. A period with a relatively high abundance of active stations was observed in the early 2000s. For instance, in 2001 the number of active stations peaked at around 155 stations for δ18O and 137 for δ2H (Fig. 1B,C). However, the spatial distribution of the monitoring sites does not provide representative coverage of the continent in this period, being rather heavily concentrated in the Mediterranean region, owing to the coordinated research project of IAEA (IAEA 2005). Therefore, to maximize the spatiotemporal coverage of the δp values available from the study region, a focus period of 2008–2017 was chosen. This period was characterized by an annual average number of stations ≥ 125 providing data in any given year (Fig. 1B,C).
Since δ18O and δ2H are highly correlated in meteoric waters (Craig 1961; Rozanski et al. 1993) and the δ2H data are fewer in the collected dataset (Fig. 1B,C), precipitation δ18O (δ18Op) was used throughout the experiments presented here; the results, however, should be valid for both. If stations from different data sources (e.g. GNIP, ANIP, literature) were situated close to each other (within 3 km) and had parallel measurements with a difference of only < 0.05‰ for a given month, the GNIP data were discarded. For data preprocessing, local Moran statistics and deuterium-excess were considered in a way similar to Erdélyi et al. (2023). In the screening process no obvious outliers were found, therefore, no data were excluded.
2.1.2 Potential predictors of spatial variability of precipitation stable isotopes across Europe
Various environmental parameters are known to control the spatial variability of δp on continental scales (Dansgaard 1964; Rozanski et al. 1993). Physical factors affecting the transport and fractionation processes within the global hydrologic cycle that can be represented by the geographical position of a given location are known to be important drivers of δp (Bowen and Revenaugh 2003; Dansgaard 1964; Rozanski et al. 1993). In the present study latitude, longitude are in WGS84 coordinate system (EPSG 4326), elevation derived from the digital elevation model obtained from Amazon Terrain Tiles (zoom parameter 5: ground resolution ~ 3 km in average)(AWS 2021) were used. These geographical variables are, however, mere reflections of the actual drivers regulating the atmospheric mixing and fractionating processes, and these are, after all, environmental mechanisms which can be considered via physical/meteorological parameters. A wide variety of meteorological parameters influencing δp spatial variability were considered: vapor pressure, monthly mean temperature, monthly average of the diurnal temperature range, monthly precipitation amount, all of which were derived from the CRU TS4.05 0.5 × 0.5° resolution dataset (Harris et al. 2020), and saturation vapor pressure calculated (Murray 1966) from the CRU monthly mean temperature.
In addition, the distance matrix of the monitoring stations providing the training data, following the procedure of Hengl et al. (2018) were also included as a possible predictor. Specifically, the set of distances of a station from the others will serve as a predictor (called buffer distance (Hengl et al. 2018)) instead of its spatial coordinates (latitude and longitude), and this will be done for all the stations. Categorical variables were also included as the Köppen-Geiger climate region’s codes (KG; resolution 0.5 × 0.5°) (Kottek 2006), which reflect the climatic effects in a complex and aggregated way and have been used as such in other studies—e.g. (Heydarizad et al. 2021; Nelson et al. 2021)—and the month and year of a given measurement to reflect the well-known seasonality of δp (Feng et al. 2009). The geographical and meteorological predictors had been extracted to the grid cells locations using nearest neighbour resampling technique.
2.2 Methodology
2.2.1 Random forest
To predict the spatial variability of δp across Europe, random forest (RF) was used. Random forest is a non-parametric ML algorithm which assesses a combination of predictors, and in which each tree depends on the values of a random vector sampled independently, with the same distribution for all trees in the forest (Breiman 2001; Cutler et al. 2012). The generalization error of a forest is dependent on the strength of the individual regression trees in a forest, it converges to a limit as the number of trees in the forest increases (Breiman 2001). The forest’s predictions are based on the average results of the decision trees, which use bootstrap sampling to decrease the possibility of over-fitting; for details see e.g. Biau and Scornet (2016); Breiman (2001); Prasad et al. (2006). The approach was chosen, since it has been successfully applied to map isotopic parameters (e.g. Bataille et al. (2018)) and other environmental variables (Hengl et al. 2018) or predicting the biological status of water bodies (Szomolányi and Clement 2023).
Two alternative versions of random forest were employed in the study regarding the set of predictors for the focus period (2008–2017). In the RF approach, all predictors were included except for the distances between the grid cells and the monitoring stations, while in the so-called RFsp (Hengl et al. 2018) the distances of the monitoring stations from the grid cells were also used as predictors. For these two models (RF and RFsp), random forest models were calculated separately for each monthly dataset, and those models were used to make the predictions.
In the case of another variant of random forest, the whole dataset (all monthly data together from 2008 to 2017) is used, with the years and months as additional categorical predictors, as the training data from which to derive the model. Following this, the monthly isoscapes are predicted (marked as RFtg). In the meanwhile, in the case of both RF and the RFsp, the training and prediction is performed on each and every monthly dataset separately. In all random forest approaches the so-called ‘variable importance’ sequence is obtained, representing the relative importance of each predictor in a random forest model.
In the present case the randomForestSRC (Ishwaran et al. 2021) and ranger (Wright and Ziegler 2015) R (R Core Team 2019) packages were used to obtain the RF and RFsp models, respectively. The specific parameter values for e.g., nodesize and mtry were determined using the tuneRanger() function following Probst et al. (2019) using the out-of-bag observations to evaluate the trained algorithm (Breiman 1996).
Evaluation of our model results consisted of out-of-sample verification, and comparison to another ML-based spatial prediction of δ18Op variability across Europe (Nelson et al. 2021). To see the possible effect of a larger dataset in space, training was performed on (i) the GNIP data only—overlapping with the only available ML alternative for δp predictions in Europe (Nelson et al. 2021)—and (ii) the entire collected dataset (Fig. 1A, B).
When evaluating the importance of the predictors used, variable importance plots were derived using mean decrease in impurity (Gini index; Breiman (2017)), which shows each variable’s selection frequency for a split during the building of a tree (Szomolányi and Clement 2023). The variable importance plots were plotted in such a way that their information would be rendered comparable to the various random forest approaches. Therefore, all the weights were converted to percentages, and only the first 11 predictors’ explanatory power was evaluated (Sect. 3, Fig. 2).
2.2.2 Validation of the random forest monthly models of δ p variability
For validation purposes, 10% of the available data (1582 cases) were retained from each month with Basel and Rovaniemi (Figs. 3 and 4) always included, since these two were used for validation in the development of the PisoAI model (Nelson et al. 2021). This semi-random selection resulted in 14 validation stations overall (Figs. 3 and 4). It was ensured that the chosen test stations would not be closer to each other than 60 km, to avoid clustering of the validation points, and required that the amount of missing monthly data was below 25% of the total time considered (i.e., between 2008 and 2017).
In the validation process numerous statistics were used. Absolute differences were calculated to explore the performance of the random forest approaches and compare it to actual measured δp values and the PisoAI model. In all cases these were obtained as the difference of the measured value of a validation station and its nearest grid cells values to where the different ML models predicted their results. The median of the absolute differences (MAD) and the mean absolute errors (MAE) were calculated to compare the model performances. MAE measures the average absolute difference between the predicted values and the actual values in a dataset. It is a commonly used error statistic instead of, for example, root mean square error, when the error distribution is not expected to be Gaussian (Chai and Draxler 2014), as in the present case (W(1529) = 0.882, p < 0.001;(Shapiro and Wilk 1965)). Also, MAE is suggested to avoid the amplification of the higher errors compared to the lower ones (Willmott and Matsuura 2005).
Additionally, the Lin's concordance correlation coefficient (Lin’s CCC, Lawrence 1989) a hybrid error metric was included. The CCC combines measures of both precision and accuracy to determine how far the predicted and measured data pairs deviate from the line of perfect concordance (y = x line). Lin’s CCC ranges from − 1 to 1, with a perfect concordance at 1. Lin’s CCC was calculated using the CCC() function of the DescTools package, (Signorell et al. 2019).
3 Results and discussion
First, the importance of the applied predictors is shown and evaluated, followed by the assessment of the performance in time. As a last step, the overall performance of the different models is compared and discussed.
3.1 Importance of predictors
When a random forest model is built, out-of-bag (OOB) samples that are not used during the construction are attributed to all the trees of the model. These OOB samples are then used to calculate the importance of a specific variable, in the present case, this means to get an insight into the controlling mechanisms of the spatial variability of precipitation stable isotopes. There is no change in the order of the predictor importance among the most powerful ones, and there is only a slight change in the lower importance ranks (Fig. 2). Only the power of KG variable drops explicitly when the amount of data is increased (Fig. 2A–C vs. D–F).
In the present case, the most important predictors were the meteorological variables (vapor pressure, saturation vapor pressure, and temperature) in the complex random forest models (RFtg and RFsp), as has been shown in the case of the PisoAI model using the XGBoost algorithm (Nelson et al. 2021). Diurnal temperature range, however, had the weakest predictive power in every case. This was true regardless of whether only the GNIP data (Fig. 2A–C) or all the available data (Fig. 2D–F) were used.
The most striking difference between the more powerful and less powerful predictors is seen in the case of the RFtg model (Fig. 2B, E), while this is (considerably) less so in the case of the classical RF model. Of the geographical variables, latitude was in every model among the most important drivers (Fig. 2). In the RF model it was the leading one (Fig. 2A,D), concurring with past studies in which latitude was the crucial predictor (Bowen and Revenaugh 2003) or was again amongst the most important ones (Hatvani et al. 2017; Liu et al. 2008; Terzer-Wassmuth et al. 2021; Terzer et al. 2013). In the RFsp model, the fourth most important predictor is S_438, which is one of the training station’s (Kuopio) distance to the grid points. There is no change in the order of the predictor importance among the most powerful ones, and there is only a slight change in the ranks of lower importance (Fig. 2).
3.2 Performances evaluated in time
The results unanimously indicate that all random forest variants are capable of reproducing the overall trends and seasonal patterns of measured precipitation stable isotope variability at each validation site across Europe. This—namely, capturing the amplitude of seasonal variations in δ18Op—however, had been a continuous challenge for previous regression based geospatial models, e.g. Cluett and Thomas (2020), Daniels et al. (2017), Shi et al. (2020), and in this respect ML approaches seem to outperform their regression-based counterparts (Nelson et al. 2021).
The predicted time series of δ18Op values from the validation sites (Fig. 3) showed that the random forest models generally tracked the seasonal variability of the measured values well in almost every case. A relatively small variance in δ18Op is seen at the coastal locations (Fig. 3A,B; for localities see Fig. 4C), while more pronounced seasonal cycles prevail inland (Fig. 3C–N; for localities see Fig. 4C). There is no single model that could be considered to outperform the others at most of the stations.
Generally, the r2 values ranged from 0.26 (Cestas—Fig. 3B) to 0.87 (Basel—Fig. 3F). Spatial clustering can be observed in the case of the linear relationships between the validation stations’ time series and the predicted ones. The weakest correlations (0.26 < r2 < 0.7; median r2 = 0.5) were seen in the case of the SW stations (A Coruña, Cestas, Tortosa, Girona; Fig. 3A–D), while higher ones (0.66 < r2 < 0.87; median r2 = 0.78) characterized the Central European validation stations (Fig. 3E–L). Somewhat weaker r2 values were recorded for the northernmost validation stations (Fig. 3M,N) ranging between 0.64 and 0.74 with a median of 0.69. When comparing the r2 values between the different models, it becomes clear that in the case of the SW validation stations the PisoAI produced the best estimations (Fig. 3A–D). Regarding the Central European stations, the RFsp gave the best predictions in four- and the PisoAI in three out of the eight cases (Fig. 3E–L). For the northernmost stations: at Espoo the PisoAI (r2 = 0.85; Fig. 3M) while at Rovaniemi the RF approach gave the highest correlations (r2 = 0.68; Fig. 3N).
Intra-annual/seasonal variability is generally well reproduced, for example, even if a winter month is less negative than preceding and following one(s), the models were in fact able to mimic the sub-seasonal pattern, e.g., 2012 Nyon, 2016/2017 Espoo. Similar accuracy was seen in case of even smaller variations in the summer months of 2014 and 2015 at St. Gallen, as well, for example.
Regarding the interannual/interseasonal differences, if a winter was more negative than the ones generally occurring at a particular station, the models still gave quite accurate predictions. For example, in Nyon, in the winter of 2014/2015, the relative difference between the δ18Op negative peaks (near − 5‰) was well reproduced compared to the following winter season, as also in the case of the Carpathian monitoring station (Liptovsky Mikulas; Fig. 3L). There, in the Decembers of 2010 and 2012, δ18Op < − 18‰, while in early winter 2016 neither the observed, nor did the modelled δ18Op drop below − 14‰, all of which again highlights the estimation power of the ML tools for δ18Op. A summer example can be drawn from Basel and Graz, where in 2010 and 2013 the summer peak was below − 5‰, while in 2015 it was above − 5‰, and the models were able to reproduce the interseasonal differences.
Despite the general and reasonable agreement between the observed and modelled time series of δ18Op, deviations are also observable. These will be elaborated on in the following order: (i) systematic misfits, (ii) multimothly-periods, (iii) monthly anomalies/single peaks.
The most obvious systematic misfit can be observed in summer at the northern validation stations (Fig. 3M, N). The Random Forest algorithms predicted more positive values (the largest errors > 4‰) for the summer month of 2009. The degree of misfit seems to decrease from 2014 (Fig. 3M, N). The improvement of the predictions, especially at Espoo, may be supposed to be a benefit of the entering of data from closer locations to the training set. Since the contribution to the GNIP database from two Estonian stations (Tartu, Vilsandi) started in 2013. Another season-specific bias can be seen in A Coruña (Fig. 3A), where the Random Forest prediction underestimates the observed δ18Op values. The largest errors, > 7‰, appear for the summer months of 2011 and 2015.
Another type of problem is when multimothly-periods with generally poor estimations occur. For example, in Cestas from late 2013 to early 2014, all models fail to match the measured data (Fig. 3B). However, the measured data itself displays an unusual, weird pattern, showing values for winter which are less negative and ones for summer which are more negative, contradicting the expected seasonality for a midlatitude location (Feng et al. 2009; Rozanski et al. 1993) and documented for nearby French stations (Zhang et al. 2020) even in this particular interval. This raises concerns about the reliability of the Cestas δ18Op record for this time interval. Another example is Graz (Fig. 3K), where from mid-2011 to early 2012, the models were unable to match the repeatedly occurring high values (> − 5‰) in winter, which were higher than the typical level of the summer δ18Op records at this site (Fig. 3K).
Although the intra-seasonal pattern was in general well reflected in the model data (see above), still, occasionally occurring cases of monthly anomalies/single peaks are seen in which the amplitude of the prediction(s) fall behind the measurements. These typically occur in cases where negative values in winter are involved, for example, in February 2010 in Rovaniemi (Fig. 3N), November 2013 in Tortosa (Fig. 3C), and November 2015 in Graz (Fig. 3K).
3.3 Overall performances
Since the empirical inspection of the δ18Op time series did not reveal explicit differences between the models’ ability to reproduce the actual observations, the monthly absolute differences were considered together at each validation stations and plotted on box-and-whiskers plots for further analysis (Fig. 4A,B). This was done for the cases when only the GNIP data (Fig. 4A) and then all data were used in the training procedure (Fig. 4B); also see Sect. 2.2.
In case of the RFsp, (i) the median value of the absolute differences was smallest in 6 out of 14 cases considering only the GNIP data for training (Fig. 4A), and 7 out of 14 times considering all data (Fig. 4B). Meanwhile, (ii) the interquartile ranges for this particular model were smallest 4 out of 14 times with only GNIP data (Fig. 4A), and 9 out of 14 times with all data (Fig. 4B), and (iii) the upper fence maximum value was the lowest 6 out of 14 times with only GNIP data (Fig. 4A) and 9 out of 14 times with all data (Fig. 4B).
Comparing this performance to the only alternative yet to be published in the specific field of precipitation isotope composition prediction (PisoAI, Nelson et al. 2021), (i) the median value of absolute differences was the smallest 8 out of 14 times with only GNIP data (Fig. 4A), and 6 out of 14 times when all the stations were considered (Fig. 4B). (ii) The interquartile ranges were the smallest 8 out of 14 times with only GNIP data (Fig. 4A), and 5 out of 14 times with all data (Fig. 4B). Lastly, (iii) the upper fence was the lowest 6 out of 14 times with only GNIP data (Fig. 4A, and 5 out of 14 times with all data (Fig. 4B). When not only the GNIP, but all the stations available for the present study (Fig. 1) were considered, RFsp proved to provide more accurate predictions in 7 out of 14 cases (Fig. 4B). It has to be noted that in this case random forest models were trained on a larger set compared to PisoAI (Nelson et al. 2021), which was trained exclusively on GNIP data.
The worst median values were produced by the RF model when dealing with only GNIP (Fig. 4A), and the RFtg model when all data were used (Fig. 4B). There is no clear indication that the Random Forest models created by merging all the monthly datasets (RFtg) would produce better predictions than the versions calculated for every month. Among the validation stations, the most striking deviations were seen at Rovaniemi and A Coruña primarily in summer (Fig. 3A,N and Sect. 3.1). These stations are situated at a marginal position with a relative scarcity nearby training stations (Figs. 1A, 4C). Thus, without an abundance of similar training stations, the random forest algorithm struggles reproducing the empirical degree of the latitudinal effect (Dansgaard 1964).
In general, with the inclusion of additional sites in the training set compared to the GNIP data, the predictive accuracy of the random forest approaches increased in 12, 7 and 10 cases out of the 14 for the RF, RFtg and RFsp variants, respectively (Fig. 4A vs. 4B: MAE values). The biggest improvement was achieved at the validation stations located in the eastern part of the Alps (Kufstein, Wildalpen and Graz; Fig. 4C), where in fact the presence of the ANIP network (Fig. 1A: red dots) increased the total abundance to the greatest degree (Fig. 1B, C). The difference in predictive accuracy (MAD) ranged from 0.73 to 2.87‰; Fig. 4B vs. A.
No comparable improvement in MAD was observed for the validation stations in the western Alpine Region (Nyon, Basel, St. Gallen). A characteristic example is the Basel station with the addition of the ANIP data; still, the MAD increased to between 0.06 and 0.16 ‰ for all random forest variants (Fig. 4A, B). This region neighbors the ANIP network (Figs. 1 and 4C), thus some degree of improvement might have been expected. A similar situation is observed at the northernmost validation stations, where the most promising random forest variant gave poorer estimations (Rovaniemi and Espoo) when the model was trained on an extended dataset compared to the GNIP version (Figs. 4A, B).
To assess the overall performance—regardless of the spatial location(s)—of the tested ML models, the MAD, MAE and Lin’s concordance correlation coefficient values were calculated for the 14 validation stations were averaged (Table 1). In this respect, the PisoAI (MAD = 1.113‰) performed better by 0.2‰ than the most promising random forest variant (RFsp—MAD = 1.299‰), when only the GNIP data were considered; with the complete database, the RFsp method gave nearly as low MAD value (1.124‰). In case of the MAE values, RFsp resulted the lowest value (1.345‰) when all the data were included for the analysis. The highest score for the Lin’s CCC was provided by the RFsp variant when only the GNIP network data considered (Lin’s CCC RFsp GNIP only = 0.986), and further improved when additional stations (all data) were involved to the training set (Lin’s CCC RFsp all data = 0.987). Those results corroborates the previous findings, namely, that with an extended dataset the RFsp approach provides more accurate results than the PisoAI, but when only the GNIP database is used the performance of the methods remains comparable (Figs. 3 and 4; Table 1).
4 Conclusions and outlook
In the study the performance of four machine learning algorithms were compared on monthly δ18Op records from 2008 to 2017. The methods were random forest (RF), random forest employing buffer distance (RFsp), a random forest model trained using all data employing month and year as categorical variables (RFtg), and lastly, the results of a previously published model employing XGBoost (Nelson et al. 2021). The accuracy of the predictions was tested on all the available data and a subset consisting of only GNIP data. Using only the GNIP data, similar results were obtained regardless of which random forest variant was employed, but these slightly underperformed compared to the XGBoost approach. However, the inclusion of additional network data in the training set further improved the results for all random forest models (Table 1). In this respect, the model employing buffer distance RFsp gave the smallest overall error amongst the random forest variants used; indeed, it was even better than that provided by the XGBoost. The median value of the absolute differences, their interquartile ranges and upper fence maximum values were all lower for a higher number of validation stations in the case of the RFsp model than in the case of the PisoAI approach (Fig. 4A, B). The three leading predictors were vapor pressure, saturation vapor pressure, and temperature in the complex random forest models (RFtg and RFsp), a finding similar to that obtained using the PisoAI model with the XGBoost algorithm (Nelson et al. 2021). It was an obvious, yet important, observation that subregions represented by only a few stations (especially in northern Europe) benefit from the inclusion of additional data in the training dataset.
It is suggested that the next step in the research should be to increase the dataset by an even greater extent and perform comprehensive tests using the best performing methods, complemented by the XGBoost algorithm (Nelson et al. 2021) and random forest combined with Lasso regression (Erdélyi et al. 2023) which allows extrapolation outside the data boundaries (Zhang et al. 2019).
Gaining an in-depth knowledge of how machine learning methods behave under different circumstances (spatiotemporal data density, various predictors etc.) in predicting precipitation stable isotopes on continental scales is a necessary step. It is our expectation that in the near future, classical geostatistical approaches will be overtaken by geomathematical ML tools in the derivation of the next generation of (precipitation) isoscapes.
References
Aggarwal, P.K., et al.: Proportions of convective and stratiform precipitation revealed in water isotope ratios. Nat. Geosci. 9, 624 (2016). https://doi.org/10.1038/ngeo2739
Amundson, R., Chadwick, O., Kendall, C., Wang, Y., DeNiro, M.: Isotopic evidence for shifts in atmospheric circulation patterns during the late Quaternary in mid–North America. Geology 24, 23–26 (1996)
AWS (2021) Terrain Tiles. Amazon. https://registry.opendata.aws/terrain-tiles. Accessed 10 Dec 2021
Bataille, C.P., Crowley, B.E., Wooller, M.J., Bowen, G.J.: Advances in global bioavailable strontium isoscapes. Palaeogeogr. Palaeoclimatol. Palaeoecol. 555, 109849 (2020). https://doi.org/10.1016/j.palaeo.2020.109849
Bataille, C.P., Jaouen, K., Milano, S., Trost, M., Steinbrenner, S., Crubézy, É., Colleter, R.: Triple sulfur-oxygen-strontium isotopes probabilistic geographic assignment of archaeological remains using a novel sulfur isoscape of western Europe. PLoS ONE 16, e0250383 (2021). https://doi.org/10.1371/journal.pone.0250383
Bataille, C.P., von Holstein, I.C.C., Laffoon, J.E., Willmes, M., Liu, X.-M., Davies, G.R.: A bioavailable strontium isoscape for Western Europe: A machine learning approach. PLoS ONE 13, e0197386 (2018). https://doi.org/10.1371/journal.pone.0197386
Biau, G., Scornet, E.: A random forest guided tour. TEST 25, 197–227 (2016). https://doi.org/10.1007/s11749-016-0481-7
Birkel, C., et al.: Characterization of surface water isotope spatial patterns of Scotland. J. Geochem. Explor. 194, 71–80 (2018). https://doi.org/10.1016/j.gexplo.2018.07.011
Bowen, G.J.: Isoscapes: spatial pattern in isotopic biogeochemistry. Annu. Rev. Earth Planet. Sci. 38, 161–187 (2010). https://doi.org/10.1146/annurev-earth-040809-152429
Bowen, G.J., Good, S.P.: Incorporating water isoscapes in hydrological and water resource investigations WIREs. Water 2, 107–119 (2015). https://doi.org/10.1002/wat2.1069
Bowen, G.J., Revenaugh, J.: Interpolating the isotopic composition of modern meteoric precipitation. Water Resour. Res. 39, 1299 (2003). https://doi.org/10.1029/2003WR002086
Breiman, L.: Out-of-Bag Estimation. University of California, Department of Statistics (1996)
Breiman, L.: Bagging predictors. Random for Mach Learn 45, 5–32 (2001). https://doi.org/10.1023/A:1010933404324
Breiman L (2017) Classification and Regression Trees. 1st edn. Routledge, New York https://doi.org/10.1201/9781315139470
Chai, T., Draxler, R.R.: Root mean square error (RMSE) or mean absolute error (MAE)? – Arguments against avoiding RMSE in the literature. Geosci Model Dev 7, 1247–1250 (2014). https://doi.org/10.5194/gmd-7-1247-2014
Clark, I.D., Fritz, P.: Environmental Isotopes in Hydrogeology. Taylor and Francis (1997)
Cluett, A.A., Thomas, E.K.: Resolving combined influences of inflow and evaporation on western Greenland lake water isotopes to inform paleoclimate inferences. J. Paleolimnol. 63, 251–268 (2020). https://doi.org/10.1007/s10933-020-00114-4
Coplen, T.B.: Reporting of stable hydrogen, carbon and oxygen isotopic abundances. Pure App Chem 66, 273–276 (1994)
Coplen, T.B., Herczeg, A.L., Barnes, C.: Isotope engineering—using stable isotopes of the water molecule to solve practical problems. In: Cook, P.G., Herczeg, A.L. (eds.) Environmental Tracers in Subsurface Hydrology. Springer US, Boston, MA, pp. 79–110. (2000) https://doi.org/10.1007/978-1-4615-4557-6_3
Craig, H.: Isotopic Variations in Meteoric Waters. Science 133, 1702–1703 (1961). https://doi.org/10.1126/science.133.3465.1702
Cutler, A., Cutler, D.R., Stevens, J.R.: Random Forests. In: Zhang, C., Ma, Y. (eds.) Ensemble Machine Learning: Methods and Applications. Springer US, Boston, MA, pp. 157–175 (2012) doi:https://doi.org/10.1007/978-1-4419-9326-7_5
Daniels, W.C., Russell, J.M., Giblin, A.E., Welker, J.M., Klein, E.S., Huang, Y.: Hydrogen isotope fractionation in leaf waxes in the Alaskan Arctic tundra. Geochim. Cosmochim. Acta 213, 216–236 (2017). https://doi.org/10.1016/j.gca.2017.06.028
Dansgaard, W.: Stable isotopes in precipitation. Tellus 16, 436–468 (1964)
Ehleringer, J.R., Bowen, G.J., Chesson, L.A., West, A.G., Podlesak, D.W., Cerling, T.E.: Hydrogen and oxygen isotope ratios in human hair are related to geography. Proc. Natl. Acad. Sci. 105, 2788–2793 (2008). https://doi.org/10.1073/pnas.0712228105
Erdélyi, D., Hatvani, I.G., Jeon, H., Jones, M., Tyler, J., Kern, Z.: Predicting spatial distribution of stable isotopes in precipitation by classical geostatistical- and machine learning methods. J. Hydrol. 617(Part C), 129129 (2023). https://doi.org/10.1016/j.jhydrol.2023.129129
Feng, X., Faiia, A.M., Posmentier, E.S.: Seasonality of isotopes in precipitation: a global perspective. J. Geophys. Res. Atmos. (2009). https://doi.org/10.1029/2008JD011279
Fórizs, I.: Isotopes as natural tracers in the watercycle: examples from the carpathian basin. Studia UBB Physica 1, 69–77 (2003)
Funck, J., Bataille, C., Rasic, J., Wooller, M.: A bio-available strontium isoscape for eastern Beringia: a tool for tracking landscape use of Pleistocene megafauna. J. Quat. Sci. 36, 76–90 (2021). https://doi.org/10.1002/jqs.3262
Gibson, J.J., Edwards, T.W.D.: Regional water balance trends and evaporation-transpiration partitioning from a stable isotope survey of lakes in northern Canada. Glob. Biogeochem. Cycl. 16, 10-11–10-14 (2002). https://doi.org/10.1029/2001GB001839
Gopal, S.: Artificial Neural Networks in Geospatial Analysis. In: Richardson, D., Castree, N., Goodchild, M.F., Kobayashi, A., Liu, W., Marston, R.A. (eds.) International Encyclopedia of Geography. John Wiley & Sons, Ltd., pp. 1–7. (2016) https://doi.org/10.1002/9781118786352.wbieg0322
Harris, I., Osborn, T.J., Jones, P., Lister, D.: Version 4 of the CRU TS monthly high-resolution gridded multivariate climate dataset. Scientific Data 7, 109 (2020). https://doi.org/10.1038/s41597-020-0453-3
Hatvani, I.G., Erdélyi, D., Vreča, P., Kern, Z.: Analysis of the spatial distribution of stable oxygen and hydrogen isotopes in precipitation across the Iberian Peninsula. Water 12, 481 (2020). https://doi.org/10.3390/w12020481
Hatvani, I.G., Leuenberger, M., Kohán, B., Kern, Z.: Geostatistical analysis and isoscape of ice core derived water stable isotope records in an Antarctic macro region. Polar Sci. 13, 23–32 (2017). https://doi.org/10.1016/j.polar.2017.04.001
Heaton, K., Kelly, S.D., Hoogewerff, J., Woolfe, M.: Verifying the geographical origin of beef: The application of multi-element isotope and trace element analysis. Food Chem. 107, 506–515 (2008). https://doi.org/10.1016/j.foodchem.2007.08.010
Hengl, T., Nussbaum, M., Wright, M.N., Heuvelink, G.B., Gräler, B.: Random forest as a generic framework for predictive modeling of spatial and spatio-temporal variables. PeerJ 6, e5518 (2018)
Heydarizad, M., Gimeno, L., Sorí, R., Minaei, F., Mayvan, J.E.: The stable isotope characteristics of precipitation in the middle east highlighting the link between the Köppen climate classifications and the δ18O and δ2H values of precipitation. Water 13, 2397 (2021)
Hobson, K.A.: Tracing origins and migration of wildlife using stable isotopes: a review. Oecologia 120, 314–326 (1999). https://doi.org/10.1007/s004420050865
Hobson, K.A., Wassenaar, L.I.: Linking breeding and wintering grounds of neotropical migrant songbirds using stable hydrogen isotopic analysis of feathers. Oecologia 109, 142–148 (1996). https://doi.org/10.1007/s004420050068
IAEA (2005) Isotopic composition of precipitation in the Mediterranean Basin in relation to air circulation patterns and climate. TECDOC Series, vol 1453. International Atomic Energy Agency, Vienna, Austria
IAEA (2019) Global Network of Isotopes in Precipitation. The GNIP Database. http://www.isohis.iaea.org. Accessed 12.11.2019
Ishwaran, H., Kogalur, U., Kogalur, M. RandomForestSRC: fast unified random forests for survival regression, and classification (RF-SRC) (2021) [(accessed on 16 July 2020)]
Kaseke, K.F., Wang, L., Wanke, H., Turewicz, V., Koeniger, P.: An analysis of precipitation isotope distributions across namibia using historical data. PLoS ONE 11, e0154598 (2016). https://doi.org/10.1371/journal.pone.0154598
Kern, Z., Kohán, B., Leuenberger, M.: Precipitation isoscape of high reliefs: interpolation scheme designed and tested for monthly resolved precipitation oxygen isotope records of an Alpine domain. Atmos. Chem. Phys. 14, 1897–1907 (2014). https://doi.org/10.5194/acp-14-1897-2014
Kottek, M.G., Grieser, J., Beck, C., Rudolf, B., Rubel, F.: World Map of the Köppen-Geiger climate classification updated. Meteorol. z. 15, 259–263 (2006). https://doi.org/10.1127/0941-2948/2006/0130
Kovács, J., Tanos, P., Korponai, J., Székely, I.K., Gondár, K., Gondár-Sőregi, K., Hatvani, I.G.: Analysis of Water Quality Data for Scientists. In: Voudouris, K., Voutsa, D. (eds.) Water quality monitoring and assessment. InTech, pp 65–94 (2012) https://doi.org/10.5772/32173
Kralik, M., Papesch, W., Stichler, W.: Austrian Network of Isotopes in Precipitation (ANIP): Quality assurance and climatological phenomenon in one of the oldest and densest networks in the world. Isotope Hydrol. Integr. Water Resour. Manage. 146–149 (2003)
Lawrence, I.K.L.: A concordance correlation coefficient to evaluate reproducibility. Biometrics 45, 255–268 (1989). https://doi.org/10.2307/2532051
Li, J., Heap, A.D., Potter, A., Daniell, J.J.: Application of machine learning methods to spatial interpolation of environmental variables. Environ. Model. Softw. 26, 1647–1659 (2011). https://doi.org/10.1016/j.envsoft.2011.07.004
Liu, Z., Tian, L., Chai, X., Yao, T.: A model-based determination of spatial variation of precipitation δ18O over China. Chem. Geol. 249, 203–212 (2008). https://doi.org/10.1016/j.chemgeo.2007.12.011
Murray, F.W.: On the computation of saturation vapor pressure. Rand Corp Santa Monica Calif (1966)
Nan, Y., Tian, F., Hu, H., Wang, L., Zhao, S.: Stable isotope composition of river waters across the world. Water 11, 1760 (2019)
Nelson, D.B., Basler, D., Kahmen, A.: Precipitation isotope time series predictions from machine learning applied in Europe. Proc Natl Acad Sci 118, e2024107118 (2021). https://doi.org/10.1073/pnas.2024107118
Oczkowski, A., Kreakie, B., McKinney, R.A., Prezioso, J.: Patterns in stable isotope values of nitrogen and carbon in particulate matter from the Northwest Atlantic continental shelf, from the gulf of maine to cape hatteras. Front Mar Sci (2016). https://doi.org/10.3389/fmars.2016.00252
Prasad, A.M., Iverson, L.R., Liaw, A.: Newer classification and regression tree techniques: bagging and random forests for ecological prediction. Ecosystems 9, 181–199 (2006). https://doi.org/10.1007/s10021-005-0054-1
Probst, P., Wright, M.N., Boulesteix, A.-L.: Hyperparameters and tuning strategies for random forest. Wires Data Min Knowl Discov 9, e1301 (2019). https://doi.org/10.1002/widm.1301
R Core Team (2019) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria
Reichstein, M., Camps-Valls, G., Stevens, B., Jung, M., Denzler, J., Carvalhais, N.: Deep learning and process understanding for data-driven earth system science. Nature 566, 195–204 (2019). https://doi.org/10.1038/s41586-019-0912-1
Rozanski, K., Araguás-Araguás, L., Gonfiantini, R.: Isotopic patterns in modern global precipitation. In: Swart, P.K., Lohmann, K.C., Mckenzie, J., Savin, S. (eds.) Climate Change in Continental Isotopic Records. American Geophysical Union, USA, pp. 1–36. (1993) https://doi.org/10.1029/GM078p0001
Shapiro, S.S., Wilk, M.B.: An analysis of variance test for normality (complete samples). Biometrika 52, 591–611 (1965). https://doi.org/10.1093/biomet/52.3-4.591
Shi, Y., Wang, S., Zhang, M., Argiriou, A.A., Guo, R., Song, Y., Zhu, X.: Isoscape of δ18O in Precipitation of the Qinghai-Tibet Plateau: Assessment and Improvement. Water 12, 3392 (2020)
Signorell, A. et al.: DescTools: Tools for Descriptive Statistics, vol 28 (2019)
Stahl, M.O., Gehring, J., Jameel, Y.: Isotopic variation in groundwater across the conterminous United States – Insight into hydrologic processes. Hydrol. Process. 34, 3506–3523 (2020). https://doi.org/10.1002/hyp.13832
Szatmári, G., Pásztor, L.: Comparison of various uncertainty modelling approaches based on geostatistics and machine learning algorithms. Geoderma 337, 1329–1340 (2019). https://doi.org/10.1016/j.geoderma.2018.09.008
Szomolányi, O., Clement, A.: Use of random forest for assessing the effect of water chemistry parameters on the biological status of surface waters. Int. J. Geomath. same topical collection IJGE-D-22-00053 (2023)
Terzer-Wassmuth, S., Wassenaar, L.I., Welker, J.M., Araguás-Araguás, L.J.: Improved high-resolution global and regionalized isoscapes of δ18O, δ2H and d-excess in precipitation. Hydrol. Process. 35, e14254 (2021). https://doi.org/10.1002/hyp.14254
Terzer, S., Wassenaar, L.I., Araguás-Araguás, L.J., Aggarwal, P.K.: Global isoscapes for δ18O and δ2H in precipitation: improved prediction using regionalized climatic regression models. Hydrol. Earth Syst. Sci. 17, 4713–4728 (2013). https://doi.org/10.5194/hess-17-4713-2013
Wassenaar, L., Terzer-Wassmuth, S., Douence, C.: Progress and challenges in dual- and triple-isotope (δ18O, δ2H, Δ17O) analyses of environmental waters: an international assessment of laboratory performance. Rapid Commun. Mass Spectrom. 35, e9193 (2021). https://doi.org/10.1002/rcm.9193
Willmott, C.J., Matsuura, K.: Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Climate Res. 30, 79–82 (2005)
Wright MN, Ziegler A (2015) ranger: A fast implementation of random forests for high dimensional data in C++ and R. arXiv preprint arXiv:150804409
Yoshimura, K.: Stable water isotopes in climatology, meteorology, and hydrology: a review. J Meteorol Soc Jpn Ser II 93, 513–533 (2015). https://doi.org/10.2151/jmsj.2015-036
Zhang H, Nettleton D, Zhu Z (2019) Regression-enhanced random forests. arXiv preprint arXiv:190410416
Zhang, J., Genty, D., Sirieix, C., Michel, S., Minster, B., Régnier, E.: Quantitative assessments of moisture sources and temperature governing rainfall δ18O from 20 years' monitoring records in SW-France: Importance for isotopic-based climate reconstructions. J. Hydrol. 591, (2020). https://doi.org/10.1016/j.jhydrol.2020.125327
Acknowledgements
On behalf of project 'Limitations of spectral analysis of sedimentary proxy records, tested on simulated time series with timescale error and variable temporal resolution' we thank for the usage of ELKH Cloud (https://science-cloud.hu/) which helped us achieve the results published in this paper. This research has been supported by the National Research, Development and Innovation Office (grant SNN143868). This is contribution no. 85 of the 2ka Palæoclimatology Research Group.
Funding
Open access funding provided by ELKH Research Centre for Astronomy and Earth Sciences.
Author information
Authors and Affiliations
Contributions
ZK and DE conceived and designed the study with contribution from IGH. DE performed the analysis and produced the figures. IGH, ZK and DE wrote the paper, with contributions from TN. The FLAE approach was applied to the sequence of authors; see https://doi.org/10.1371/journal.pbio.0050018. All authors have read and agreed to the published version of the manuscript.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Erdélyi, D., Kern, Z., Nyitrai, T. et al. Predicting the spatial distribution of stable isotopes in precipitation using a machine learning approach: a comparative assessment of random forest variants. Int J Geomath 14, 14 (2023). https://doi.org/10.1007/s13137-023-00224-x
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s13137-023-00224-x