1 Introduction

In 2009, a book dedicated to the Turing Award winner Jim Gray introduced the idea of a fourth paradigm of science (Hey et al. 2009). The concept behind this new paradigm was based on a provocation: the existing paradigms, i.e. those of experimental science, theoretical science, and computational science, were, on the one hand, unsuitable for analyzing the increasing amounts of available data, while on the other hand, they were contributing to the generation of such masses of data. In this context, data are measured by instruments or generated by simulations, then processed by software, and the resulting information or knowledge is stored in computers. As a result, scientists do not have access to their data until much later in the process. This data-intensive science demands new techniques and technologies that differ from those used in earlier scientific paradigms. Therefore, a new paradigm is required: that of data science (Kotu and Deshpande 2019). In a broader sense, data science is a discipline that combines domain expertise, computer science skills, and mathematical and statistical algorithms to transform data into actionable knowledge that supports and validates decisions and enables predictions.

While this new paradigm has found widespread application in a variety of scientific domains, progress in the geo-environmental sciences has been sluggish and has had a limited impact on our understanding of environmental, climatic, and social processes. This is all the more striking given the massive amount of data generated by earth-observing systems, in-situ observations (including crowd-sourced data), and climate-related models. To turn these data into valuable knowledge, scientists will have to cope with the so-called five V’s of big data: velocity, volume, value, variety and veracity. They will, however, also confront challenges due to the unique characteristics of geo-environmental data, the most evident of which is their spatial nature. In 1992, Carl Franklin estimated that about 80% of data included a spatial component (Franklin and Hane 1992). This proportion has plausibly increased over time, as more data have been collected by satellites, mobile devices, and Internet of Things (IoT) tools. Furthermore, most of these data include a temporal dimension, often inextricably linked to the spatial one by the underlying physical process that the data represent. The analysis of these spatiotemporal data is made more complex by other issues, including: (i) data heterogeneity, due to different sources or multiple spatiotemporal scales; (ii) collection biases, due to clustered observation networks, the presence of undersampled or oversampled regions, or short observational records; and (iii) complex spatiotemporal dependencies, including lagged and long-distance relationships between variables. In addition, most common data analysis approaches are not designed for autocorrelated, heterogeneous, and non-linear socio-environmental data. For all of these reasons, mining geo-environmental data sets raises difficulties that are rarely addressed in other domains.

We believe that spatiotemporal data science will play a crucial role in addressing many of the great environmental and social challenges of the twenty-first century and beyond. In this context, this special issue contributes to the open discussion on the methodological advances needed to analyze complex spatiotemporal processes. At the same time, several applied case studies highlight the potential of spatiotemporal data science to provide reliable solutions to real-world problems. Papers in this issue include case studies ranging from hydrology and hydro-morphology (Budiman et al. 2021; Wang et al. 2021; Rolim et al. 2021; Tang et al. 2021; Tian et al. 2021; Sottile et al. 2021; Han and Morrison 2021) to geology (Giaccone et al. 2021), geomechanics (Li et al. 2021; Luo et al. 2021; Lombardo and Tanyas 2021; Aguilera et al. 2022; D’Angelo et al. 2022; Bryce et al. 2022; Grimm et al. 2022), atmospheric phenomena and renewable energy (La Fata et al. 2022; Amato et al. 2022; Kajbaf et al. 2022), and pathogenic viruses and associated diseases (Niraula et al. 2022; Temple et al. 2022).