An Introduction to the Spatio-Temporal Analysis of Satellite Remote Sensing Data for Geostatisticians

Satellite remote sensing data have become available in meteorology, agriculture, forestry, geology, regional planning, hydrology or natural environment sciences since several decades ago, because satellites provide routinely high quality images with diﬀerent temporal and spatial resolutions. Joining, combining or smoothing these images for a better quality of information is a challenge not always properly solved. In this regard, geostatistics, as the spatio-temporal stochastic techniques of geo-referenced data, is a very helpful and powerful tool not enough explored in this area yet. Here, we analyze the current use of some of the geostatistical tools in satellite image analysis, and provide an introduction to this subject for potential researchers.


Introduction
The spatio-temporal analysis of satellite remote sensing data using geostatistical tools is still scarce when comparing with other kinds of analyses. In this chapter we provide an introduction to this field for geostatisticians, empathising the importance of using the spatio-temporal stochastic methods in satellite imagery and providing a review of some applications (Sagar and Serra 2010). We explain how to proceed for accessing remote sensing data, and which are the common tools for downloading, pre-processing, analysing, interpolating, smoothing and modeling these data. The chapter encloses six additional sections where a short explanation of the state of the art in the analysis of remote sensing data using free statistical software is given. Particular attention is devoted to the use of geostatistical tools in this subject. Section 13.2 explains the profile and the main features of the most popular satellites. It also encompasses Sect. 13.2.1 for describing some R packages for importing, analysing, and managing satellite images. Section 13.3 explains how to retrieve two derived variables, the normalized difference vegetation index (NDVI) and the land surface temperature (LST). In Sect. 13.4 some common methods of pre-processing data after downloading satellite images are reviewed. Section 13.5 explains the importance of the spatial interpolation in remote sensing data and reviews the most popular interpolation methods. The actual scenario of the spatio-temporal geostatistics is reviewed in Sect. 13.6, where an additional subsection describes some R packages for using spatial and spatio-temporal geostatistics techniques with satellite images. The paper ends up with some conclusions in Sect. 13.7.

Satellite Images
Satellite images are available since more than four decades ago, and since then there has been a notable improvement in quality, quantity, and accessibility of these images, making it easier to extract huge amounts of data from all over the Earth. We can retrieve data from the land or the ocean, from the coast or the mountains, and also from the atmosphere where advanced sensors give the opportunity of monitoring meteorological variables that are crucial for the study of the climatic change, the phenology trend, the changes in vegetation or many other environmental processes.
Remote sensing refers to the process of acquiring information from the Earth or the atmosphere using sensors or space shuttles platforms. Therefore, remote sensing is born as a crucial necessity when using satellite images for analyzing and converting them into different frames of data that can be managed with specific software. Nowadays, Landsat, Modis, Sentinel or Noaa are some of the most popular satellite missions among researchers and practitioners of remote sensing data because of the free accessibility. Next, we summarize the main characteristics of these missions: . Landsat supplies high resolution visible and infrared imagery, with thermal imagery, and a panchromatic image also available from the ETM+ sensor. Landsat also provides land cover facility to complement overall project goals of distributing a global, multi-temporal, multi-spectral and multi-resolution range of imagery appropriate for land cover analysis. 2. The SENTINEL satellites were launched from 2013 onwards and include radar, spectrometers, sounders, and super-spectral imaging instruments for land, ocean and atmospheric applications (Aschbacher and Milagro-Pérez 2012). In particular, the multispectral instrument on-board Sentinel-2 aims at measuring the Earth reflected radiance through the atmosphere in 13 spectral bands spanning from the Visible and Near Infra-Red to the Short Wave Infra-Red. The main goal of this satellite is the monitoring of rapid changes such as vegetation characteristics during growing seasons with improved change detection techniques. 3. NOAA is the acronym of National Oceanic and Atmospheric Administration.
The satellite observations of the atmosphere on a global scale began more than 40 years ago. In the URL (NOAA 2017), it is said that over 150 data variables from satellites, weather models, climate models, and analyses are available to map, interact with, and download using NOAA View's Global Data Explorer. NOAA generates more than 20 terabytes of daily data from satellites, buoys, radars, models, and many other sources. All of that data are archived and distributed by the National Centers for Environmental Information. 4. Moderate Resolution Imaging Spectroradiometer (MODIS) is a key instrument aboard the TERRA and AQUA satellites. See the URL (MODIS 2017) for details. TERRA's orbit around the Earth is timed so that it passes from north to south across the equator in the morning, while AQUA passes south to north over the equator in the afternoon, providing a high temporal resolution of images all over the world. TERRA MODIS and AQUA MODIS are viewing the entire Earth's surface every 1 to 2 days, acquiring data in 36 spectral bands, or groups of wavelengths. These data facilitate the global dynamics and processes occurring on the land, in the oceans, and in the lower atmosphere.
Remote sensing data of some of these missions can be accessed via the free statistical software R, publicly accessible in R Core Team (2017).

Access and Analysis of Satellite Images with R
This subsection provides a summary of some R packages that can be used for downloading, importing, accessing, processing, and smoothing remote sensing data from satellite images.  (Gerber et al. 2016) fills missing values in satellite data and develops new gap-fill algorithms. The methods are tailored to images observed at equallyspaced points in time. 3. gdalUtils (R Core Team 2017) gives wrappers for the geospatial data abstraction library (GDAL) utilities. 4. gimms (R Core Team 2017) provides a set of functions to retrieve information about GIMMS NDVI3g files 5. landsat (Goslee 2011) includes relative normalization, image-based radiometric correction, and topographic correction options. 6. landsat8 (Survey 2015) provides functions for converted Landsat 8 multispectral satellite imagery rescaled to the top of atmosphere (TOA) reflectance, radiance and/or at satellite brightness temperature using radiometric rescaling coefficients provided in the metadata file (MTL file). 7. mod09nrt (R Core Team 2017) processes and downloads MODIS Surface reflectance Product HDF files. Specifically, MOD09 surface reflectance product files, and the associated MOD03 geo-location files (for MODIS-TERRA). 8. MODIS (R Core Team 2017) allows for downloading and processing functionalities for the Moderate Resolution Imaging Spectroradiometer (MODIS) 9. modiscloud (Nicholas J. Matzke 2013) is designed for processing downloaded MODIS cloud product HDF files and derived files 10. raster (R Core Team 2017) is a very powerful library for the geographic data analysis and modeling 11. rgdal (R Core Team 2017) provides bindings for the geospatial data abstraction library. 12. satellite (Nauss et al. 2015) provides a variety of functions which are useful for handling, manipulating, and visualizing remote sensing data.

Derived Variables from Remote Sensing Data
When a satellite image is accessed, an assorted number of bands are provided. The combination of these bands can facilitate different types of remote sensing data. For example, extracting the Normalized Difference Vegetation Index (NDVI) can be done by a simple combination of bands. NDVI is an important index that reflects vegetation growth and it is closely related to the amount of photosynthetically absorbed active radiation as indicated by Slayback et al. (2003) and Tucker et al. (2005). It is calculated using the radiometric information obtained for the red (R) and nearinfrared (NIR) wavelengths of the electromagnetic spectrum in the following way: (Rouse Jr et al. 1974). As mentioned in Sobrino and Julien (2011), this parameter is sensitive to the blueness of the observed area, which is closely related to the presence of vegetation. Although numerical limits of NDVI can vary for the vegetation classification, it is widely accepted that negative NDVI values correspond to water or snow. NDVI values close to zero could correspond to bare soils, yet these soils can show a high variability. Values between 0.2 and 0.5 (approximately) to sparse vegetation, and values between 0.6 and 1.0 conform to dense vegetation such as that found in temperate and tropical forests or crops at their peak growth stage. Therefore, NDVI provides a very valuable instrument for monitoring crops, vegetation, and forestry, and it is directly calculated in specific images by the aforementioned satellites missions. On the left of Fig. 13.1 a Sentinel NDVI satellite image of Funes, a village of Navarra (Spain) is shown, and on the right of the same Figure, the NDVI for the whole region of Navarra. Another important variable derived with satellite images is the land surface temperature (LST), that can be retrieved with different algorithmic procedures. As an example Sobrino et al. (2004) compare three methods to retrieve the LST from thermal infrared data supplied by band 6 of the Thematic Mapper (TM) sensor onboard the Landsat 5 satellite. The first is based on the radiative transfer equation using in situ radiosounding data. The others are the mono-window algorithm developed by Qin et al. (2001) and the single-channel algorithm developed by Jiménez-Muñoz and Sobrino (2003). Many satellites platforms provide specific images of LST all over the Earth, because it is also a very outstanding variable for many environmental process. Figure 13.2 shows the daily land surface temperature in Navarra (Spain) the 13th of July 2015 from TERRA satellite.

Pre-processing
The atmosphere is between the satellite and the Earth, and its effects over the electromagnetic radiation caused by the satellite can distort, blur or degrade the images. These effects must be corrected before the image processing. The correction consists of composing several images into a new single one. Different algorithms have been developed in the literature according to the derived variable. The most common method with NDVI is the maximum value composite (MVC) procedure (Holben 1986) that assigns the maximum value of the time-series of pixels across the composite period. Alternative techniques include using a bidirectional reflectance distribution function (BRDF-C) to select observations and the constraint view angle maximum value composite (CV-MVC) (MODIS 2017). For LST day/night it is common to average the cloud-free pixels over the compositing period (Vancutsem et al. 2010). Nowadays, many composite images can be directly downloaded with different spatial and temporal resolutions. For example, raw daily images can be downloaded from AQUA or TERRA satellites all over the world, but usually composite images are at least of weekly or bi-weekly temporal resolution.
Spatial and temporal resolutions are also different from the same or different satellites. High temporal resolution can be useful when tracking seasonal changes in vegetation on continental and global scales, but when downscaling to small regions, a higher spatial resolution is needed, and frequently with lower temporal resolution. At this step, numerical, physical or mechanical analyses solve the image pre-processing. Later, removing the effect of clouds or other atmospheric effects is also required, otherwise remote sensing data can be inaccurate. Sometimes, the highest presence of clouds determine the dropout of several images, but if they are only partially clouded, different approaches for eliminating these effects can be used. Noise reduction in image time series is neither simple nor straightforward. Many alternatives have been provided. For example R.HANTS macro of GRASS, SPIRITS, BISE, TIME-SAT, GAPFILL or the CACAO methods are very well spread. R.HANTS performs an harmonic analysis of time series in order to estimate missing values and identify outliers (Roerink et al. 2000). SPIRITS is a software that processes time series of images (Eerens et al. 2014). It was developed by PROBA-V data provider and gives four smoothing options, including MEAN (Interpolate missing values & apply Running Mean Filter RMF) and BISE (Best Index Slope Extraction), (Viovy et al. 1992). TIMESAT uses numerical procedures based on Fourier analysis, Gauss, double logistic or SavitzkyGolay filters (Jönsson and Eklundh 2004). GAPFILL uses quantile regression to produce smoothed images where the effect of the clouds have been reduced. Usually, every software has different requirements with regard to the number of images necessary for smoothing (Atkinson et al. 2012). Finally, CACAO software (Verger et al. 2013) provides smoothing, gap filling, and characterizing seasonal anomalies in satellite time series.
All these procedures give composite images that are smoothed versions of the raw images, but very often they are not completely free of noise. Many of the attributes that can be extracted from the combination of satellite image bands are still vulnerable to many atmospheric or electronic accidents. For example, highly reflective surfaces, including snow and clouds, and sun-glint over water bodies may saturate the reflective wavelength bands, with saturation varying spectrally and with the illumination geometry (Roy et al. 2016). Land surface temperature or normalized vegetation index are examples of attributes where these type of errors can be present. Therefore, after pre-processing is done, interpolation and smoothing methods can be very useful for drawing or detecting trend changes, clustering or many other processes on remote sensing data.

Spatial Interpolation
Likely, interpolation and classification are among the most used tools with remote sensing data. Classification of satellite images in supervised or unsupervised versions are important research areas not only with satellite images but also with big data and data mining where there are a great number of algorithmic procedures (Benz et al. 2004). Here, we are more interested in interpolation as it is more closely related to geostatistics.
Interpolation has been widely used in environmental sciences. Li and Heap (2011) revise more than 50 different spatial interpolation methods that can be summarized in three categories: non-geostatistical methods, geostatistical methods, and combined methods. All of them can be represented as weighted averages of sampled data. Among the non-geostatistical methods the authors find: nearest neighbours, inverse distance weighting, regression models, trend surface analysis, splines and local trend surfaces, thin plate splines, classification, and regression trees. The different versions of simple, ordinary, disjunctive or model-based kriging are among the geostatistical methods. The combined methods include: trend surface analysis combined with kriging, linear mixed models, regression trees combined with kriging or regression kriging.
Recently, Jin and Heap (2014) present an excellent review of spatial interpolation methods in environmental sciences introducing 10 methods from the machine learning field. These methods include support vector machines (SVM), random forests (RF), neural networks, neuro-fuzzy networks, boosted decision trees (BDT), the combination of SVM with inverse distance weighting (IDW) or ordinary kriging (OK), the combination of RF with IDW or OK (RFIDW, RFOK), general regression neural network (GRNN), the combination of GRNN with IDW or OK, and the combination of BDT with IDW or OK. Although all these methods were not developed specifically for remote sensing data, nowadays the majority of them have been implemented in different packages of the free statistical software R, and can be used with satellite images. Many of these methods are ready to use and interpret, but the family of kriging methods as the core of geostatistics, are preferred and widely used.

Spatio-Temporal Interpolation
Since the publication of the seminal book Spatial Autocorrelation (Cliff and Ord 1973), and at latter date Spatial Statistics (Ripley 1981), Statistics for Spatial Data (Cressie and Wikle 2015), and Multivarate Geostatistics (Wackernagel 1995) books, there has been a rapid growth of spatial geostatistical methods, as they are essential tools for interpolating meteorological, physical, agricultural or environmental variables in locations where these variables are not observed.
The use of spatial geostatistics with remote sensing data is also very well widespread, and its procedures are present in many specific softwares of satellite image analysis (Stein et al. 1999). Geostatistics techniques can help to explore and describe the spatial variability, to design optimum sampling schemes, and to increase the accuracy estimation of the variables of interest. These models can be enriched with auxiliary information coming from classified land cover or historical information (Curran and Atkinson 1998). Kriging is the most popular geostatistical method with several versions such as block kriging, universal kriging, ordinary kriging, regression kriging or indicator kriging. It provides the spatial interpolation of different spatial variables through the use of spatial stochastic models, and it is the best linear unbiased predictor under normality assumptions when using spatially dependent data.
However, the extension to the spatio-temporal geostatistics methods is more complicated. Time series models typically assume a regularly sampling over time, but the temporal lag operator cannot be easily generalized to the spatial domain, where data are likely irregularly sampled (Phaedon and André 1999). Scales of time and space are different, therefore defining joint spatio-temporal covariance functions is not a trivial task (De Iaco et al. 2002). Recently, Cressie and Wikle (2015) show the state of the art in this area and explain the difficulties of inverting covariance matrices in spatio-temporal kriging, because it becomes problematic without some form of separable models or dimension reduction. Modelling the spatio-temporal dependence is frequently case-specific. Therefore, yet the presence of the spatio-temporal keyword is abundant in many satellite imagery papers, the use of spatio-temporal stochastic models is scarce. Very often, spate-time refers only to descriptive analyses of time series of satellite images where every image is analyzed as a set of separate pixels, i.e., when estimating trends, or trend changes, statistical methods of univariate time series are used for every pixel. For example, when completing, reconstructing or predicting the spatial and temporal dynamics of the future NDVI distribution many papers use a time series of images (Forkel et al. 2013;Tüshaus et al. 2014;Klisch and Atzberger 2016;Wang et al. 2016;Liu et al. 2015;Maselli et al. 2014). These studies include temporal correlation of individual pixels at different resolutions but ignoring spatial dependence among them.
Spatio-temporal stochastic models use the spatial or temporal dependence to estimate optimally local values from sampled data. In satellite images, sampled data can be a huge amount of spatially and temporally dependent pixels, if a sequence of images is involved. We briefly review in what follows some stochastic spatio-temporal models that can be used when analysing remote sensing data.
1. Spatio-temporal kriging (Gasch et al. 2015). This paper uses spatio-temporal R packages for fitting some of the following spatio-temporal covariance functions: separable, product-sum, metric and sum-metric classesin a spatio-temporal kriging model, and a random forest algorithm for modeling dynamic soil properties in 3-dimensions. 2. State-space models (Cameletti et al. 2011). The authors apply a family of statespace models with different hierarchical structure and different spatio-temporal covariance function for modelling particular matter in Piemonte (Italy). 3. Hierarchical spatio-temporal model (Cameletti et al. 2013). The paper introduces a hierarchical spatio-temporal model for particulate matter (PM) concentration in the North-Italian region Piemonte. The authors use stat-space models involving a Gaussian Field (GF), affected by a measurement error, and a state process characterized by a first order autoregressive dynamic model and spatially correlated innovations. The estimation is based on Bayesian methods and consists of representing a GF with Matérn covariance function as a Gaussian Markov Random Field (GMRF) through the Stochastic Partial Differential Equations (SPDE) approach. Then, the Integrated Nested Laplace Approximation (INLA) algorithm is proposed as an alternative to MCMC methods, giving rise to additional computational advantages (Rue et al. 2009). 4. Spatio-temporal data-fusion (STDF) methodology (Nguyen et al 2014). This method is based on reduced-dimensional Kalman smoothing. The STDF is able to combine the complementary GOSAT and AIRS datasets to optimally estimate lower-atmospheric CO2 mole fraction over the whole globe. 5. Hierarchical statistical model (Kang et al. 2010). This model includes a spatiotemporal random effects (STRE) model as a dynamical component, and a temporally independent spatial component for the fine-scale variation. This article demonstrates that spatio-temporal statistical models can be made operational and provide a way to estimate level-3 values over the whole grid and attach to each value a measure of its uncertainty. Specifically, a hierarchical statistical model is presented, including a spatio-temporal random effects (STRE) model as a dynamical component and a temporally independent spatial component for the fine-scale variation. Optimal spatio-temporal predictions and their mean squared prediction errors are derived in terms of a fixed-dimensional Kalman filter. 6. Three-stage spatio-temporal hierarchical model (Fassò and Cameletti 2009).
This work gives a three-stage spatio-temporal hierarchical model including spatio-temporal covariates. It is estimated through an EM algorithm and bootstrap techniques. This approach has been used by (Militino et al. 2015) for interpolating daily rainfall data, and for estimating spatio-temporal trend changes in NDVI with satellite images of Spain from 2011-2013 (Militino et al. 2017). 7. Space-varying regression model (Bolin et al. 2009). In this space-varying regression model the regression coefficients for the spatial locations are dependent. A second order intrinsic Gaussian Markov Random Field prior is used to specify the spatial covariance structure. Model parameters are estimated using the Expectation Maximisation (EM) algorithm, which allows for feasible computation times for relatively large data sets. Results are illustrated with simulated data sets and real vegetation data from the Sahel area in northern Africa.

Geostatistical R Packages
In this section we briefly describe some of the most useful R packages for geostatistical analysis, including spatial and spatio-temporal interpolation in satellite imagery.
1. FRK (Cressie and Johannesson 2008) means fixed rank kriging and it is a tool for spatial/spatio-temporal modelling and prediction with large datasets.
2. geoR (Ribeiro Jr et al. 2001) offers classical geostatistics techniques for analysing spatial data. The extension to generalized linear models was made in geoRglm package (Christensen and Ribeiro 2002). 3. georob (R Core Team 2017) fits linear models with spatially correlated errors to geostatistical data that are possibly contaminated by outliers. 4. geospt (Melo et al. 2012) estimates the variogram through trimmed mean and does summary statistics from cross-validation, pocket plot, and design of optimal sampling networks through sequential and simultaneous points methods. 5. geostatsp (Brown 2015) provides geostatistical modelling facilities using raster.
Non-Gaussian models are fitted using INLA, and Gaussian geostatistical models use maximum likelihood estimation. 6. gstat (Pebesma 2004) does spatio-temporal kriging, sequential Gaussian or indicator (co)simulation, variogram and variogram map plotting utility functions. 7. RandomFields (Schlather et al. 2015) provides methods for the inference on and the simulation of Gaussian fields. 8. spacetime (Pebesma et al. 2012) gives methods for representations of spatiotemporal sensor data, and results from predicting (spatial and/or temporal interpolation or smoothing), aggregating, or sub-setting them, and to represent trajectories. 9. spatial (Venables and Ripley 2002) provides functions for kriging and point pattern analysis. 10. spatialEco (Evans 2016) does spatial smoothing, multivariate separability, point process model for creating pseudo-absences and sub-sampling, polygon and point-distance landscape metrics, auto-logistic model, sampling models, cluster optimization and statistical exploratory tools. It works with raster data. 11. SpatialTools (R Core Team 2017) contains tools for spatial data analysis with emphasis on kriging. It provides functions for prediction and simulation. 12. spBayes (Finley et al. 2007) fits univariate and multivariate spatio-temporal random effects models for point-referenced data using Markov chain Monte Carlo (MCMC).

Conclusions
The multitemporal Earth observation satellites have been very well developed since the seventies, and along with the free availability of millions of satellite images, the number of publications of remote sensing data with geostatistical techniques has been rapidly increased. But unfortunately, not all published papers deriving, analysing or monitoring spatio-temporal evolutions, spatio-temporal trends or spatio-temporal changes are necessarily geostatistical papers, because they do not really use spatio-temporal stochastic models. These models are still scarce in remote sensing data because many of these models are computationally very intensive, or because they are not so broadly applicable as the spatial models are. The solutions found in the literature are very well fitted to specific problems, but we cannot always plug-in to other applications. The use of time series analysis in remote sensing opens a great window of opportunities for monitoring, smoothing, and detecting changes in large series of satellite images, but there are still many remote sensing papers ignoring the spatial dependence when analysing time series of images (Ban 2016). Instead, a huge discretization of the problem is presented where time-series of pixels are treated as spatially independent. Nowadays, the upcoming opportunities for geostatisticians in remote sensing data are not based on the use of spatial models and time series separately, but on the use of spatial, temporal, or spatio-temporal stochastic models embedding both types of dependencies when necessary. Moreover, a single free statistical software like R is a powerful tool for downloading, importing, accessing, exploring, analysing and running advanced statistical modelling with remote sensing data in a row.