1 Introduction

Over the last decade, advances in modern technology have made it possible to handle large, complex and high-dimensional spatio-temporal data. Such data are often based on dense sampling schemes of observations over space, time, and other continua. Accompanying this growth of (high-quality, informative and big) data, there has been rising scientific interest in new statistical methods able to handle and analyse them.

Although classical multivariate statistical techniques can still be applied to this kind of data, they do not capture the additional information coming from the generating process that underpins the data. Spatial functional statistics is a recent research area that combines the well-developed branches of functional statistics and spatial statistics, and is able to analyse such complex, multivariate data. It has been developed in the framework of the Functional Data Analysis (FDA) paradigm [Ramsay and Silverman (2005); Ferraty and Vieu (2006)] by taking into account the spatial structure of the data. FDA includes methods and theory for data coming in the form of functions, and spatial functional statistics extends this approach to deal with samples of functions recorded at different locations of a region (the so-called spatially correlated functional data), or functions observed over a discrete set of time points (temporally correlated functional data).

In contrast to other methods used to analyse spatio-temporal data, spatio-functional techniques extend spatial techniques while making no parametric assumptions about time effects. As with FDA, spatial functional statistics can also be seen as a different way of thinking, where the basic unit of information is the entire observed function rather than a string of numbers, with the additional condition related to the spatial component. In particular, the distinguishing characteristics of the developed methods are mainly related to the several spatial data structures (geostatistical data, point patterns and areal data) that can be combined with functional data. Thus, as in univariate or multivariate spatial data analysis, methods developed in spatial functional statistics can be broadly classified into two main categories: geostatistical functional methods and purely spatial functional methods.

Geostatistical functional methods aim to describe the spatial correlation of functional phenomena and provide adaptations of regression techniques to account for this spatial dependence. One of the main goals is the spatial prediction of curves when a sample of spatially correlated curves is collected at a set of locations of a region. In the other category of purely spatial functional methods, we find techniques dealing with point pattern data associated with functional marks, and with functional areal data (groups of functional data with a spatial reference). The objectives (among others) are to study the spatial distribution of curves, to test hypotheses about the observed pattern, to detect and model such spatial dependence, and to identify spatial clustering among the curves.

Good, well-motivated examples of spatial indexing problems come from many environmental fields, including climate studies. Environmental indicators recorded through widely accessible electronic devices are often regarded as noise-free, and can be seen to have spatial functional characteristics. In these cases the information consists of a set of curves associated with different geographical locations on a spatial domain.

A substantial and interesting illustration of the underlying philosophy of spatial functional statistics can be found in Delicado et al. (2010a) and Ruiz-Medina (2012). These papers provide an overview of spatial functional statistical methods, starting with the simple notion of a spatial functional observation and its diversification into the different types of spatial functional data structures according to the domain to which the observation belongs.

However, a search of the literature makes it clear that, amongst the several categories of spatial functional methods, functional geostatistics has been developed far more extensively, in terms of both new methodological approaches and the analysis of a wide range of case studies covering varied fields of application. Indeed, a partial aim in writing this special issue is to encourage the development of further methods in all the branches of the spatial functional perspective. The focus is not only on case studies, which are per se an interesting and very pragmatic angle on new developments, but also on methodological contributions. In fact, the area of spatially dependent functional structures has several specific and difficult aspects that make it a research topic in its own right. The articles selected here move beyond the current state of the art in various ways, from spatial prediction methods to clustering, passing through depth measures for spatially dependent functional data.

In the following section we sketch the current state of the art of spatial functional statistics by describing the general areas of research. The newly selected proposals are placed in their bibliographical context through a short discussion of the current literature.

2 An overview of the selected approaches and applications

This special issue comprises six papers considering computational and analytical techniques to deal with functional data with spatial dependence. The issue collects works from experts in this area from Sweden, Spain, Colombia, the United States, and Italy.

In particular, there are four papers dealing with spatial prediction. Briefly, Aguilera-Morillo et al. (2016) deal with prediction for spatial functional variables whose observations are a set of spatially correlated sample curves obtained as realisations of a spatio-temporal stochastic process. Bernardi et al. (2016) propose a method of regression with partial differential regularisations for spatially dependent curves. Then, Bohorquez et al. (2016) show how to perform optimal spatial prediction of a functional variable at unsampled locations. Finally, Espejo et al. (2016) propose a dynamic spatial-depth functional regression model, where the functional response and the covariates are indexed in time, and take their values in the space of square integrable functions over the spatial-depth domain.

Two more papers complete the special issue. Abramowicz et al. (2016) propose a functional non-parametric clustering method which simultaneously clusters and aligns spatially dependent curves. Balzanella et al. (2016) address the problem of getting order statistics for georeferenced functional data by means of depth functions for spatially dependent functional data.

Note that prediction methods in spatial functional statistics have been developed from different points of view as adapted extensions of more classical prediction methods. Kriging techniques and spatial regression methods have been adapted to the case of spatially correlated curves. Kriging [Chiles and Delfiner (1999)] is a well-known prediction method in classical geostatistics. It predicts values of a (scalar) random field from sampled surrounding data points, weighted according to a spatial covariance function. The assumptions under which it is developed (whether the process is considered stationary or non-stationary) lead to different types of kriging techniques, such as ordinary kriging (OK), universal kriging (UK), indicator kriging, cokriging and others. There have been a number of adaptations of kriging approaches for spatially correlated functional data. After the pioneering work of Goulard and Voltz (1993), the papers of Delicado et al. (2010b) and Nerini et al. (2010) have been crucial. Both propose ordinary kriging approaches that predict a curve at an unsampled site under the assumption of stationarity. They are based on the functional linear point-wise model adapted to the case of spatially correlated curves. The idea behind this procedure is a direct adaptation of the classical geostatistical prediction problem to curves, after a particular smoothing process has been applied. The prediction problem is then solved by estimating a linear model of coregionalisation to capture the spatial dependence among the fitted coefficients. The main methodological difference between these two approaches is that Nerini et al. (2010) require orthonormal basis functions, whereas orthogonality is not required in Delicado et al. (2010b). A somewhat different definition of an ordinary kriging predictor for functional data can be found in Delicado et al. (2011), with an implementation in R in Giraldo et al. (2012b).
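To make the weighting idea concrete, the following is a minimal sketch of an ordinary kriging predictor for curves in which a single scalar weight is attached to each sampled curve. It is an illustration under stated assumptions and not the implementation of any of the cited papers: the exponential covariance `exp_cov`, its `sill` and `range_` parameters and all function names are hypothetical, the covariance is assumed known rather than estimated from the data, and the curves are assumed to be evaluated on a common grid.

```python
import numpy as np


def exp_cov(h, sill=1.0, range_=1.0):
    """Hypothetical exponential covariance; stands in for an estimated spatial covariance."""
    return sill * np.exp(-np.asarray(h) / range_)


def ok_weights(coords, target, cov=exp_cov):
    """Ordinary kriging weights for one target site (weights constrained to sum to one)."""
    n = coords.shape[0]
    # Pairwise distances between sampled sites, and from each site to the target.
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    dists_to_target = np.linalg.norm(coords - target, axis=-1)
    # Kriging system [[C, 1], [1', 0]] [lambda, mu]' = [c0, 1]'.
    system = np.ones((n + 1, n + 1))
    system[:n, :n] = cov(dists)
    system[n, n] = 0.0
    rhs = np.ones(n + 1)
    rhs[:n] = cov(dists_to_target)
    solution = np.linalg.solve(system, rhs)
    return solution[:n]  # drop the Lagrange multiplier


def krige_curve(coords, curves, target, cov=exp_cov):
    """Predicted curve at `target` as a weighted sum of the sampled curves."""
    return ok_weights(coords, target, cov) @ curves


# Synthetic example: 6 sites in the unit square, curves observed on 25 grid points.
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 1.0, size=(6, 2))
grid = np.linspace(0.0, 1.0, 25)
curves = np.sin(2 * np.pi * grid)[None, :] + coords[:, :1] * grid[None, :]

weights = ok_weights(coords, np.array([0.5, 0.5]))
predicted = krige_curve(coords, curves, np.array([0.5, 0.5]))
```

With no nugget effect in the assumed covariance, the predictor interpolates exactly: predicting at a sampled site returns the curve observed there.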

We note that geostatistical methods quite often assume that the spatial functional process considered is stationary, that is, the mean function is constant over space (no trend), the variance function is constant, and the covariance function depends only on the distance between the locations. However, in many applied cases, the assumption of a constant mean function is clearly not realistic. A number of contributions address this situation (see Caballero et al. 2013; Menafoglio et al. 2013; Ignaccolo et al. 2014; Reyes et al. 2015). In all these cases, the stationarity assumption is relaxed. Caballero et al. (2013) propose a new predictor by extending the classical universal kriging predictor for univariate data to the context of functional data. Menafoglio et al. (2013) establish a kriging theory for random fields in any separable Hilbert space, allowing for the analysis of a broad range of object data, such as curves, surfaces or images. Reyes et al. (2015) generalise the classical residual kriging method used in univariate geostatistics by proposing a three-step procedure. Finally, by considering more complex forms of non-stationarity in which the mean function depends on exogenous variables (either scalar or functional), the work of Ignaccolo et al. (2014) develops the so-called kriging with external drift, or regression kriging, in a functional data setting.
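For reference, the stationarity assumption described at the start of the preceding paragraph can be written out as follows. The notation is ours and is only one common way of formalising it: $\chi_{s}(t)$ denotes the curve observed at site $s$ in the spatial domain $D$, evaluated at $t$ in the functional domain $T$.

```latex
\begin{align*}
\mathbb{E}\left[\chi_{s}(t)\right] &= m(t), && s \in D,\ t \in T,\\
\operatorname{Var}\left[\chi_{s}(t)\right] &= \sigma^{2}(t), && s \in D,\ t \in T,\\
\operatorname{Cov}\left[\chi_{s_i}(t), \chi_{s_j}(t)\right] &= C\left(\lVert s_i - s_j \rVert;\, t\right), && s_i, s_j \in D,\ t \in T.
\end{align*}
```

The non-stationary approaches cited above relax the first condition by letting the mean depend on the location, for instance through a drift term $m(s, t)$.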

With a focus on estimation procedures, spatio-temporal modelling and non-parametric spatial regression models for functional data have also been proposed. In the former case, covariance functions for data evolving in space and time are modelled directly through some kind of spatial-functional model. Some of these alternative methods working with spatio-temporal models for spatial-functional data have been proposed by Yamanishi and Tanaka (2003) and Bel et al. (2010). Bel et al. (2010) use a functional linear model to describe the relationship between the genetic diversity in European beech forests and curves of temperature and precipitation reconstructed from the past. In addition, in order to take the spatial dependence into account, they estimate the covariance matrix of the residuals in a spatial framework. In contrast, Yamanishi and Tanaka (2003) propose a regression model in which both response and predictors are functional data and the relation among the variables may change over space. In a similar way, a non-parametric kernel regression with scalar response and functional predictors, as observations of a continuous spatial process, is proposed by Dabo-Niang and Yao (2007). With the main aim of answering an important space physics question regarding global changes in the ionosphere, Gromenko (2013) proposes a new functional regression approach that handles unevenly spaced, partially observed curves.

In connection with this substantial amount of research, four new methods proposed in this special issue deal with the problem of prediction for spatially correlated functional data.

Aguilera-Morillo et al. (2016) provide a spatial regression method based on a penalised estimation criterion with the aim of predicting a random variable continuously in time and space. In particular, a three-dimensional P-spline penalty in the least squares fitting criterion is proposed in order to handle the spatial component of the data. Their method is compared with classical ordinary kriging, and its performance is illustrated with simulated data. An application to the Canadian temperature data set, introduced by Ramsay and Silverman (2005), is also analysed. This data set has often been studied using functional techniques for spatial data (see good examples in Giraldo et al. (2012a, b) and Delicado et al. (2010a, b)).

Bernardi et al. (2016) propose a regression method with partial differential regularisation with the aim of accounting specifically for the geometry of the domain of interest. The focus in this case is on dealing with complex time and space structures by considering two roughness penalties that account separately for the regularity of the field in space and in time. The proposed method has advantages over classical spatial data analysis techniques since, as the authors show, it deals efficiently with data distributed over irregularly shaped domains, with complex boundaries, strong concavities and interior holes. Combining the spatial and temporal regularisation in this way yields an advantage over kriging methods for complex spatial functional data structures. The proposed method is compared via simulation studies with other spatio-temporal techniques, and it is applied to the analysis of the annual production of waste in the towns of Venice province.

These two papers propose regression methods that focus on different ways to account for the spatial dependence in the functional data, and to predict a functional observation at a non-observed spatial location. The most relevant difference lies in the way they include the spatial component in the model and, as a consequence, in the entire estimation process. In the former case, both components are included in a spatial regression model, and combining spatial smoothing with a P-spline penalty provides a method to predict spatially correlated functional data. In the latter case, based on the idea of regression with differential regularisations, the model merges functional and numerical techniques to deal with the spatial functional covariates.

Espejo et al. (2016) propose a dynamic spatial-depth functional regression model, where the functional response and the covariates are indexed in time, and take their values in the space of square integrable functions over the considered spatial-depth domain. The authors offer a new perspective for exploiting the spatial-depth ocean temperature field. This is an important challenge in the environmental field, previously analysed by Ruiz-Medina and Espejo (2012, 2013), who focus on the problem of spatial functional extrapolation of ocean surface temperature profiles and surface temperature anomalies.

The prediction of a functional variable must also be considered in the case where there is cross-variability with other functional variables. Analysing the spatial cross-dependencies with other functional variables is complicated by the infinite dimensionality of the data. The key difficulty lies in specifying and estimating the function responsible for the relationship between distinct variables, that is, the cross-covariance function.

To overcome this drawback, Bohorquez et al. (2016) propose a functional cokriging method based on the representation of each function in terms of its empirical functional principal components. The basic idea the authors develop is a generalisation of their previous paper on the univariate case. They show that the functional cokriging depends only on the auto-covariance and cross-covariance of the associated score vectors, which are scalar random fields. In addition, they propose an approach to find optimal sampling designs that ensure the quality of the spatial functional predictions in the presence of covariates.

The procedure proposed by Bohorquez et al. (2016) has the advantage of using the functional principal component representation of each random field separately. Thus, the functional cokriging method does not require multivariate functional principal component analysis. Part of the paper is devoted to clarifying the characteristics of a coregionalisation model for functional data, where the authors carefully present their solution to the problem of ensuring that the covariance matrix is positive definite. Several important ideas are emphasised in the paper, among them the possibility of extending functional geostatistics under Gaussianity, and goodness-of-fit tests for a joint Gaussian distribution in Hilbert spaces.
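The score-based representation underlying this kind of approach can be illustrated with a short sketch. It is not the authors' implementation: it assumes curves observed on a common, equally spaced grid, computes the empirical principal components through a singular value decomposition of the centred data matrix (ignoring quadrature weights), and uses hypothetical function names. Each column of `scores` is then a scalar variable indexed by site, to which scalar kriging or cokriging tools could in principle be applied.

```python
import numpy as np


def empirical_fpca(curves, n_components):
    """Empirical principal components of discretised curves (rows = sites, columns = grid points)."""
    mean = curves.mean(axis=0)
    centred = curves - mean
    # Rows of vt are orthonormal directions in the discretised function space.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    components = vt[:n_components]
    # One row of scores per site, one column per retained component.
    scores = centred @ components.T
    return mean, components, scores


def reconstruct(mean, components, scores):
    """Rebuild curves from their scores: mean + sum_k score_k * component_k."""
    return mean + scores @ components


# Synthetic example: 8 sites, curves observed on 20 grid points.
rng = np.random.default_rng(1)
curves = rng.normal(size=(8, 20))

# Centred data from 8 sites has rank at most 7, so 7 components reproduce the curves.
mean, components, scores = empirical_fpca(curves, n_components=7)
rebuilt = reconstruct(mean, components, scores)
```

In practice far fewer components would be retained, and the truncated reconstruction would be an approximation rather than an exact copy of the data.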

All these proposed prediction methods have been proven to provide statistical tools for many applied problems related to environmental, weather, and climate studies.

Another interesting topic of research in these fields is the use of clustering methods to describe local spatial functional characteristics of the observed phenomena. The state of the art in clustering is not as well developed as that of the prediction problem. To the best of our knowledge, there are just a few approaches which consider the inclusion of the spatial correlation within clustering methods. Examples of spatial functional clustering are provided in Romano et al. (2010, 2016) and Secchi et al. (2012). These use iterative algorithms to partition geographically referenced data. In particular, a first approach [Romano et al. (2010)] classifies curves by minimising the spatial variability in each cluster and proposes, as prototype of a cluster, a kriging prediction at an unsampled location. Thus, optimising the objective function not only predicts the representative curve of each cluster, but also estimates its location. The approach of Romano et al. (2016) addresses an important issue in environmental fields: the description of clusters in terms of spatial dispersion. The method is an extension of the dynamic clustering algorithm to a set of spatially dispersed functions. A further alternative to these methods has been proposed in Secchi et al. (2012), with a bagging strategy in which the final partitioning of the data is obtained by bagging together weak analyses performed on reduced datasets.

Other contributions provide two types of hierarchical clustering methods. The first is a generalisation of a hierarchical approach for spatial data to functional data, obtained by weighting the dissimilarity matrix by a measure of spatial functional covariance [see Giraldo et al. (2012a)]. The performance of this method has been shown in Romano et al. (2015). The second addresses the practical problem of identifying groups of stations along a river network which are spatially coherent: it calculates the spatial covariance function between functions from sites along the river network, and applies this measure as a weight within the functional hierarchical clustering step [Haggarty et al. (2015)].

Finally, a non-parametric model-based method with a spatially correlated error structure to classify service accessibility patterns for the financial services industry has been proposed by Jiang and Serban (2012).

For completeness of this special issue, two more key topics are developed. Abramowicz et al. (2016) propose a novel method, the Bagging Voronoi K-medoid Alignment algorithm (BVKMA), that jointly handles clustering, misalignment, and spatial dependence of functional data. As claimed by the authors, this method is the first proposal in the literature that jointly deals with these three sources of variability. Moreover, it allows many different families of warping functions to address the problem of misalignment.

In addition, the concept of a depth function, which establishes the centrality of an observation among a set of functions and provides a natural centre-outward ordering of the sampled curves, is also a hot topic. Depth definitions are mainly obtained as generalisations of the classical depth concept, and these definitions can come from integrals of univariate depth, from the graphical representation of the functions, or from depth-based projections for functions. A good reference is Lopez-Pintado and Romo (2011), where it is shown that most of the proposed depth functions have degenerate behaviour in infinite dimensional spaces. Only one of these, the modified version of the Half Region Depth (HRD), does not suffer from such behaviour, and it is simple and computationally fast.
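As a rough illustration of how a depth of this kind orders curves, the sketch below computes a discretised version of a modified half-region depth. It rests on assumptions that should be checked against Lopez-Pintado and Romo (2011): the depth of a curve is taken as the minimum of two averages, namely the proportion of the domain on which the sample curves lie on or below it and the proportion on which they lie on or above it. The curves are assumed to be evaluated on a common, equally spaced grid, the function name is hypothetical, and no spatial dependence is taken into account.

```python
import numpy as np


def modified_half_region_depth(curves):
    """Discretised modified half-region depth of each curve (rows = curves, columns = grid points)."""
    # Entry [j, i, t] compares curve i with curve j at grid point t.
    others = curves[None, :, :]
    reference = curves[:, None, :]
    # Average proportion of (curve, grid point) pairs lying on or below each reference curve.
    below = (others <= reference).mean(axis=(1, 2))
    # Average proportion lying on or above each reference curve.
    above = (others >= reference).mean(axis=(1, 2))
    return np.minimum(below, above)


# Synthetic example: five constant curves at levels 0, 1, 2, 3, 4 on 10 grid points.
curves = np.arange(5.0)[:, None] * np.ones((1, 10))
depths = modified_half_region_depth(curves)
deepest = int(np.argmax(depths))
```

In this toy sample the middle curve (level 2) receives the largest depth and the two extreme curves the smallest, which is the centre-outward ordering described above.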

In spatial functional statistics this topic has been addressed by Balzanella and Romano (2015), where the spatial dependence among the curves is introduced in the definition of the band depth. The spatial covariance function plays the role of a weighting scheme among the geostatistical functional data. However, the spatial covariance function considered measures the spatial dependence of all the curves in the space, but does not consider the individual contribution that each curve makes to the whole spatial variability. Moreover, it suffers from degenerate behaviour for some standard probability models in functional spaces.

The contribution of Balzanella et al. (2016) overcomes these problems. It is a generalisation of the graphical approach based on the modified version of the HRD proposed by Lopez-Pintado and Romo (2011), and consists of defining spatial dispersion functions computed for each site of the observed functional data. By introducing the concept of a spatial dispersion function as a transformation of the functional data, this proposal has the following advantages: (a) it furnishes a criterion for ranking simultaneously the spatial and the functional component of the data; (b) it allows the definition of a distribution of the spatial dispersion functions characterised by robust location estimates, such as the median spatial dispersion and the quartile functions.

3 Final conclusions

In this introductory paper to the present special issue we have discussed some new techniques for spatially dependent functional data, opening new areas for future research. We hope that these contributions will further enhance the current interest in statistical methods in the spatial functional framework.

We would like to express our gratitude to the reviewers who collaborated in the preparation of this special issue. We are especially grateful to the Editor-in-Chief and the Editorial Board of Stochastic Environmental Research and Risk Assessment for their support.