Abstract
A selective variance reduction (SVR) script is presented that applies linear regression models to the principal components (PCs) of multitemporal, monthly averaged night land surface temperature (LST) imagery, in an attempt to split off the variance associated with elevation, latitude and longitude. The recently released version 6 MODIS LST data (MYD11C2), with a spatial resolution of 0.05°, are used, and the method is applied to the SW USA. The innovation lies in the use of unstandardized PCs. Thus, the reconstructed LST should express the deviation, in degrees Celsius, from the LST predicted by elevation, latitude and longitude. The reconstructed data quantify the temporal and spatial patterns of thermal anomalies. Modeling of the frequency distributions of the reconstructed LST imagery indicates that snow melting and variations in water table depth are possibly associated with the negative thermal anomaly observed during spring. The GNU Octave 4.0 SVR software implementation is available at http://selectivevariancereduction.sourceforge.net for testing and evaluation.
Introduction
Currently, land surface temperature (LST) data sets are computed from satellite-based remotely sensed images with high temporal resolution at a moderate spatial scale, allowing day and night monitoring of the earth’s surface (Wan 2013, 2014). An example is the LST acquisitions from the MODerate-resolution Imaging Spectroradiometer (MODIS) on board the Aqua satellite (Liu et al. 2015; Miliaresis 2014b).
The LST data sets can be used to address many research questions in landscape modelling (Voinov et al. 2004), hydrologic process simulation (Guo et al. 2016), climate change research (Wilcke and Bärring 2016) and vegetation growth studies (Miliaresis 2014b).
The identification and mapping of thermal anomalies is a key issue in environmental analysis (Friedel 2012; Li et al. 2010). Miliaresis (2009) defined LST anomalies, from a time series of LST imagery, as regions presenting significantly higher or lower LST than their surrounding area. However, LST is correlated with elevation (H), latitude (LAT) and longitude (LON), so the quantification of thermal anomalies over vast regions is difficult (Miliaresis 2012c). In this context, Miliaresis (2012a, b) presented a method for the H, LAT, LON decorrelation stretch of multitemporal night monthly LST imagery. The method was extended to account for distance from the coastline and applied to vast regions in the Zagros Ranges (Miliaresis 2013) and in Antarctica (Miliaresis 2014a).
In the previous research efforts (Miliaresis 2012a, b, c, 2013, 2014a), the computation of PCs from the cross-correlation matrix resulted in thermal anomalies being expressed as normal scores, which limited the environmental applications of the method. In addition, the evaluation of the thematic information content of the reconstructed imagery was based on the interpretation of the spatial pattern (cluster maps) and the temporal pattern (cluster centroids). Clustering applies a sort of generalization that, under certain circumstances, could hide data characteristics (Miliaresis 2013, 2014a). Thus, there is a need to study and model the frequency distributions of the reconstructed LST data in an attempt to quantify the thermal anomalies.
The MODIS data products have been substantially improved by a new processing algorithm (Wan 2013), and the version 6 data products (Wan 2014) are gradually being released to the public (MYD11C2.006 2016). Thus, there is also a need to evaluate the version 6 MODIS LST data in the context of the SVR method.
Until now, neither the software implementation of the SVR method nor the data for a specific study area have been available to the public for testing and evaluation purposes. On the other hand, version 4.0 of the freely available GNU Octave data analysis and modeling software has been released (Octave 2015); it runs under Windows, Linux and Mac OS X. Octave 4.0 includes a graphical user interface, support for object-oriented programming, better compatibility with Matlab, and many new and improved functions.
The aim of this research effort is to implement the H, LAT, LON decorrelation stretch algorithm in the Octave 4.0 environment, in an attempt to provide a tool that will allow the wide application of the SVR method in various scientific fields. The method is tested with the new version of the MODIS LST data (MYD11C2.006 2016) in the SW USA.
In methodological terms, an unstandardized variant of the SVR method is defined and used in an attempt to express thermal anomalies in the reconstructed imagery as the deviation, in degrees Celsius, from the LST predicted by elevation, latitude and longitude. In addition, modeling of the frequency distributions of the reconstructed imagery may reveal hidden thermal anomaly characteristics in the study area.
Methodology
In order to minimize the effect of H, LAT and LON on the multitemporal LST dataset, a sort of data transformation is required that produces a new set of images presenting high correlations with the three independent variables under consideration (Miliaresis 2012a). In this context, principal components analysis (PCA) is a linear transformation technique that produces a set of images, known as principal components (PCs), that are uncorrelated with one another and are ordered in terms of the amount of variance (eigenvalues) they explain from the original image set (Jolliffe 2002). In previous research efforts (Miliaresis 2012a, b, c, 2013, 2014a), PCA was applied by computing the eigenvalues and eigenvectors from the cross-correlation matrix of the LST data. So, the resulting eigenvalues and eigenvectors correspond to the standardized LST data (each month presents a mean equal to zero and a standard deviation equal to 1). Then, PCs are computed from the linear combination of eigenvectors and the corresponding pixel values of the initial images (Mather and Koch 2011). Finally, linear regression models are applied to the first two PCs and split off the variance associated with H, LAT and LON, which is included in the predicted images (Miliaresis 2012a, b, c).
The ANOVA table for each regression verifies its statistical significance (Miliaresis 2013, 2014a). The model performance is further assessed by R^{2} (R is the multiple correlation coefficient between the independent variables and the dependent variable), which represents the extent of variability in the dependent variable explained by all the independent variables (Landam and Everitt 2004). Then, the multitemporal data set is reconstructed by considering the residual images for the first 2 PCs as well as the later PCs.
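The difference between the correlation matrix and the variance–covariance matrix can be illustrated with a minimal sketch. It is written in Python rather than Octave and is not part of the SVR distribution; the two short LST series and the helper names `covariance` and `standardize` are hypothetical. It shows that a correlation coefficient is simply the covariance of standardized (unit-free) data, whereas the covariance of the raw data keeps the temperature units.

```python
from statistics import mean, pstdev

def covariance(x, y):
    """Population covariance of two equally long sequences."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

def standardize(x):
    """Rescale to mean 0 and standard deviation 1 (normal scores)."""
    mx, sx = mean(x), pstdev(x)
    return [(a - mx) / sx for a in x]

# Hypothetical LST values (degrees Celsius) for two months at four pixels.
january = [-4.0, -1.0, 2.0, 7.0]
july = [12.0, 15.0, 21.0, 24.0]

correlation = covariance(january, july) / (pstdev(january) * pstdev(july))
cov_of_standardized = covariance(standardize(january), standardize(july))

print(covariance(january, july))         # carries units of degrees Celsius squared
print(correlation, cov_of_standardized)  # the same unit-free number
```

PCs derived from the first quantity therefore stay in temperature units, while PCs derived from the second are expressed as normal scores.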
Study area
The study area (Fig. 1) is bounded by longitudes −124° to −112° (West) and latitudes 32° to 44° (North) and includes the states of California, Nevada, Utah and Arizona. In the NW of the study area (California), the climate is characterized by moderately cold winters with heavy snowfall on the mountains (Sierra Nevada Ranges) and warm, very dry summers with limited rainfall, especially in the south (Wang and Gillies 2012).
The central and the eastern parts of the study area (Nevada, Arizona and Utah) are mostly formed by a series of parallel mountain ranges alternating with flat basins (Fig. 1). The climate is generally semiarid or arid with warm summers and cold winters, but this varies by location and elevation (Wang and Gillies 2012), since some mountainous areas are high enough to experience an Alpine climate. The majority of streams and rivers flow into desert sinks or closed-basin lakes, while the Colorado River crosses the Grand Canyon in the SE (Barnett and Pierce 2008). To the south, the study area is occupied by the Mojave Desert (California) and the Sonoran Desert (Arizona).
Data
The SRTM30 digital elevation model (DEM) (Farr and Kobrick 2000; SRTM30 DEM 2015), with a spatial resolution of 0.00833 degrees (approximately 1 km at the equator), provides the elevation representation of the study area (Fig. 1). A geographic latitude/longitude grid is used with WGS 84 as the horizontal datum (SRTM30 DOC 2015). The elevation ranges between −83 and 4097 m. Negative elevations (below sea level) are observed in Death Valley (California).
LST data are acquired around local solar times 01:30, 10:30, 13:30 and 22:30 by the MODIS instruments on board the Aqua and Terra polar-orbiting satellites (MYD11C2 2011). The LST accuracy, according to Wan (2014), is better than 1 K (0.5 K in most cases) under real clear-sky conditions. Data from real clear-sky conditions within a calendar month are averaged to yield the MYD11C2 product for Aqua and the MOD11C3 product for Terra (Wan 2013). The MYD11C2 and MOD11C3 data sets provide a continuous (monthly averaged LST) sampling of the earth’s surface with a spatial resolution of 3 arc-minutes (0.05°, corresponding to approximately 5.6 km at the equator), referenced to a geographic latitude/longitude grid with WGS 84 as the horizontal datum (Wan 2013).
The version 6 Aqua MODIS night (acquired daily at 01:30) monthly averaged LST data (MYD11C2.006 2016) are used. The 12 night monthly averaged LST images for 2007 are visualized in Fig. 2. The monthly LST frequency distributions are presented in Fig. 3.
The data can also be summarized by the cross-correlation matrix in Table 1, which indicates a rather season-dependent correlation of LST with H, LAT and LON.
Density slicing of the November LST image is presented in Fig. 4. One slice (the continental regions) includes the pixels with LST <0 °C, while the other (the coastal regions) includes the pixels with LST >0 °C.
SVR software implementation
SVR is a modular, flexible, open-source GNU Octave 4.0 script for selective variance reduction of multitemporal data sets. The SVR script, the supporting functions, the visualization scripts and the study area data are freely available under the GNU General Public License, version 3. A website has been set up to facilitate the distribution (https://sourceforge.net/projects/selectivevariancereduction/). Various scripts and functions are included that allow alternative processing options and visualizations.
There are 48,224 data (land) pixels in the study area. The data are stored in a single Matlab file named california.mat that includes 4 matrices. The vector representation is selected, so each data element (vector) includes the 12 monthly LST values of a specific pixel. Thus, there are 48,224 vectors (rows) in the LST matrix and 12 columns corresponding to the monthly averaged night LST from January to December 2007. The LST vectors are visualized in Fig. 5. There are 3 more one-dimensional matrices, named H, LAT and LON, that correspond to the elevation, latitude and longitude per vector (pixel).
The main script is termed SVR.m (Table 2). SVR is an acronym for selective variance reduction, indicating that the variance associated with H, LAT and LON is subtracted from the multitemporal LST data.
There are two function calls in the SVR script (Table 2):

a. PCAfunc.m (Table 3), which computes the eigenvalues and eigenvectors from the variance–covariance matrix of LST, and

b. Normal_equation.m (Table 3), which performs the linear regressions of PC1 and PC2 versus H, LAT and LON and computes the two residual images that are used in the LST image reconstruction.
The Eig function (Table 3) uses the variance–covariance matrix for the computation of PCs (Eig 2015); thus the reconstructed LST imagery should express thermal anomalies in degrees Celsius. Translation by the mean is applied before the Eig function is called (Table 3) in an attempt to improve the accuracy of the numerical computations. Translation by the mean uses, instead of the raw monthly data, the difference between each variable (monthly averaged LST) and its sample mean. Translation does not affect the interpretation, because the variances of the original variables are the same as those of the translated variables.
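The statement that translation by the mean leaves the variances untouched can be checked with a short sketch. It is written in Python as an illustration only (the SVR scripts themselves are Octave), and the five LST values are hypothetical.

```python
from statistics import mean, pvariance

# Hypothetical LST values (degrees Celsius) of one month at five pixels.
lst_month = [-3.5, 0.0, 4.5, 9.0, 12.5]

# Translation by the mean: subtract the monthly mean from every pixel.
month_mean = mean(lst_month)
translated = [value - month_mean for value in lst_month]

print(mean(translated))                             # zero after translation
print(pvariance(lst_month), pvariance(translated))  # identical variances
```

Only the location of the data changes; the spread, and hence the variance–covariance matrix, is the same before and after the translation.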
The eigenvectors returned by the Eig function (Eig 2015) of Octave are not ordered by decreasing variance. That is why, in PCAfunc.m (Table 3), the eigenvectors are sorted in descending order of variance (eigenvalue). The PCs for the LST data of the study area are presented in Table 4.
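The sorting step can be sketched as follows. This is an illustrative Python version, not the code of PCAfunc.m; the function name `sort_eigenpairs` and the small eigenvalue list are hypothetical. The point is that the eigenvector columns must be permuted together with their eigenvalues.

```python
def sort_eigenpairs(eigenvalues, eigenvectors):
    """Return eigenvalues and matching eigenvector columns, largest first.

    eigenvectors is a list of rows; column j belongs to eigenvalues[j].
    """
    order = sorted(range(len(eigenvalues)),
                   key=lambda j: eigenvalues[j], reverse=True)
    sorted_values = [eigenvalues[j] for j in order]
    sorted_vectors = [[row[j] for j in order] for row in eigenvectors]
    return sorted_values, sorted_vectors

# Hypothetical eigen-decomposition output in arbitrary order.
values = [0.5, 9.0, 2.5]
vectors = [[1.0, 0.0, 0.0],
           [0.0, 1.0, 0.0],
           [0.0, 0.0, 1.0]]

sorted_values, sorted_vectors = sort_eigenpairs(values, vectors)

# Percent of total variance per component, as reported for each PC.
percent_variance = [100.0 * v / sum(sorted_values) for v in sorted_values]
```

After sorting, the first column of `sorted_vectors` is the eigenvector of the largest eigenvalue, so PC1 is the component that explains the most variance.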
In the SVR.m script (Table 2), the PCs (Table 4) are computed from the linear combination of the eigenvectors and the corresponding pixel values of the initial images.
The contribution of the independent variables (H, LAT, LON) to PC1 and PC2 (dependent variables) is quantified by the linear regression models in Eqs. (1) and (2).
The R^{2} (Table 5) indicates the amount of variance explained by the multiple linear regression models (Landam and Everitt 2004).
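A regression of a PC on H, LAT and LON through the normal equation, together with its R^{2}, can be sketched as below. This is a Python illustration under stated assumptions, not the code of Normal_equation.m: the six pixels, the synthetic PC1 scores and the helper names `solve_linear_system` and `normal_equation` are all hypothetical, and the coefficients used to build the synthetic scores are not those of Eqs. (1) and (2).

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def normal_equation(predictors, y):
    """Least-squares coefficients [intercept, b1, b2, ...] from X'X b = X'y."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Hypothetical pixels: (H in km, LAT in degrees, LON in degrees).
pixels = [(0.1, 33.0, -118.0), (1.2, 36.0, -115.0), (2.0, 39.0, -120.0),
          (0.5, 41.0, -113.0), (1.8, 35.0, -122.0), (0.9, 43.0, -117.0)]
# Synthetic PC1 scores built from made-up coefficients plus small noise.
noise = [0.2, -0.1, 0.0, 0.1, -0.2, 0.0]
pc1 = [50.0 - 6.0 * h - 1.0 * lat + 0.1 * lon + e
       for (h, lat, lon), e in zip(pixels, noise)]

coef = normal_equation(pixels, pc1)
predicted = [coef[0] + sum(c * v for c, v in zip(coef[1:], p)) for p in pixels]
residual = [obs - fit for obs, fit in zip(pc1, predicted)]   # residual image
mean_pc1 = sum(pc1) / len(pc1)
r2 = 1.0 - sum(e * e for e in residual) / sum((v - mean_pc1) ** 2 for v in pc1)
```

The `residual` list plays the role of the residual image: it is the part of the PC that the three independent variables do not predict, and `r2` is the share of the PC variance that they do.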
The analysis of variance (ANOVA) tables for Eqs. (1) and (2) are presented in Table 5. The F-statistic and the t-statistic (Landam and Everitt 2004) are used together to assess (a) the success of the regression models and (b) the significance of the individual independent variables (for adding or deleting variables), respectively. The F-test value for Eq. (1) indicates the overall significance of the regression, since it far exceeds the F-critical value (26.12) at the 0.01 significance level (Table 5). For Eq. (1), the coefficients for H, LAT and LON (Table 5) express the individual contribution of each independent variable to PC1. The absolute values of the t-test for the 3 independent variables (Table 5) far exceed the t-critical value (2.58) at the two-tailed 0.01 significance level, and hence their coefficients depart significantly from 0.
The ANOVA table (Table 5) for Eq. (2) also verifies the overall significance of the multiple regression model for PC2, as well as the significance of the 3 independent variables.
The SVR.m script reconstructs the PC scores (Scores2 in Table 2) by considering the regression residuals of PC1 and PC2 plus the PC3 to PC12 components. Then the inverse transformation (Table 2) computes the reconstructed LST (RLST) images by multiplying the Scores2 matrix by the transpose of the eigenvector matrix (Table 4).
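The reconstruction can be followed end to end on a deliberately small case. The sketch below is in Python and rests on several assumptions that differ from SVR.m: only two months and a single predictor (elevation) are used, the data are hypothetical, and a closed-form 2 × 2 eigen-decomposition (`eig_2x2`) stands in for the Eig function. With 12 months, PC3 to PC12 would be copied into Scores2 unchanged; here both PCs are replaced by their residuals.

```python
from math import hypot, sqrt

def center(column):
    """Translate a column by its mean."""
    m = sum(column) / len(column)
    return [v - m for v in column]

def cov(x, y):
    """Sample covariance of two already centered columns."""
    return sum(a * b for a, b in zip(x, y)) / (len(x) - 1)

def eig_2x2(a, b, c):
    """Eigenpairs of [[a, b], [b, c]] with b != 0, largest eigenvalue first."""
    mid, half = (a + c) / 2.0, sqrt(((a - c) / 2.0) ** 2 + b * b)
    pairs = []
    for lam in (mid + half, mid - half):
        vx, vy = b, lam - a
        norm = hypot(vx, vy)
        pairs.append((lam, (vx / norm, vy / norm)))
    return pairs

def residual_on(h, y):
    """Residuals of the simple regression of centered y on centered h."""
    slope = cov(h, y) / cov(h, h)
    return [yi - slope * hi for hi, yi in zip(h, y)]

# Hypothetical data: elevation (km) and LST (degrees Celsius) for two months.
elevation = [0.1, 0.6, 1.1, 1.5, 2.0, 2.4]
month_a = [9.0, 7.5, 3.0, 2.5, -2.0, -3.5]
month_b = [21.0, 17.0, 16.5, 12.0, 11.5, 8.0]

h, xa, xb = center(elevation), center(month_a), center(month_b)
(l1, v1), (l2, v2) = eig_2x2(cov(xa, xa), cov(xa, xb), cov(xb, xb))

# PC scores: linear combination of eigenvectors and centered pixel values.
pc1 = [a * v1[0] + b * v1[1] for a, b in zip(xa, xb)]
pc2 = [a * v2[0] + b * v2[1] for a, b in zip(xa, xb)]

# Replace both PCs by their regression residuals (the Scores2 matrix here).
res1, res2 = residual_on(h, pc1), residual_on(h, pc2)

# Inverse transformation: Scores2 times the transpose of the eigenvector matrix.
rlst_a = [r1 * v1[0] + r2 * v2[0] for r1, r2 in zip(res1, res2)]
rlst_b = [r1 * v1[1] + r2 * v2[1] for r1, r2 in zip(res1, res2)]
```

Because the eigenvectors come from the variance–covariance matrix, `rlst_a` and `rlst_b` are in degrees Celsius, and both are uncorrelated with elevation by construction.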
The 12 night monthly averaged reconstructed LST (RLST) images for 2007 are visualized in Fig. 6. The RLST frequency distributions per month are presented in Fig. 7, while the RLST vectors are presented in Fig. 8. Descriptive statistics (Landam and Everitt 2004) for the RLST frequency distributions are available in Table 6.
Discussion of results
Table 1 indicates a season-dependent correlation between LST and H, LAT and LON, and shows that LST decreases with increasing LAT and H.
If a different function (not Eig 2015) is used for the computation of eigenvectors (Table 3), then some or all of the PC columns in Table 4 might have opposite signs. This is acceptable, since there is no “natural” orientation for PCs (Jolliffe 2002; Mather and Koch 2011). So, the direction in which the PC axes point has no effect on the SVR computation. The key point about the Eig function is that the eigenvectors are computed from the variance–covariance matrix and not from the correlation matrix (Table 1). That is why the RLST imagery expresses thermal anomalies in degrees Celsius (Figs. 6, 7). So, in the current implementation, RLST per pixel expresses the deviation, in degrees Celsius, of LST from the LST predicted by elevation, latitude and longitude.
On the contrary, if the correlation matrix is used, then standardized thermal anomalies will be presented in each reconstructed LST image (Miliaresis 2012a, b, c, 2013, 2014a). Thus, LST thermal anomalies for every pixel will be in the range [−1, 1]. These values are either positive or negative depending on the normal scores per month of the multitemporal dataset. Under these circumstances (Miliaresis 2012a, b, c, 2013, 2014a), there is no direct correspondence between the RLST values and the magnitude and sign (positive or negative) of the thermal anomalies.
The first 2 PCs account for 97.4 % of the variance evident within the multitemporal imagery (Table 4), while PC3 to PC12 account for only 2.6 % of the total variance of the initial data. Eqs. (1) and (2) together explain 76.22 % (70.8 + 5.42 %) of the total variance evident in the multitemporal LST data. More specifically:

For Eq. (1), R^{2} equals 0.802 (Table 5). Thus, given the PC1 eigenvalue (the percent variance of PC1 equals 88.38 % in Table 4), 70.8 % (0.802 × 88.38 %) of the total variance of the LST data is explained by Eq. (1).

For Eq. (2), R^{2} equals 0.601 (Table 5). Thus, given the PC2 eigenvalue (the percent variance of PC2 equals 9.02 % in Table 4), 5.42 % (0.601 × 9.02 %) of the total variance of the LST data is explained by Eq. (2).
Thus, the two residual images for Eqs. (1) and (2) account for 17.58 % (88.38 − 70.8) and 3.6 % (9.02 − 5.42), respectively, of the total variance evident in the multitemporal dataset. So the RLST imagery (Fig. 6) accounts for only 23.78 % (17.58 + 3.6 + 2.6 %) of the total variance of the initial data, namely the part that is independent of H, LAT and LON.
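The variance bookkeeping above is plain arithmetic and can be reproduced with a few lines of Python (an illustration only; the variable names are hypothetical, the four input numbers are those quoted from Tables 4 and 5). Note that the unrounded product 0.802 × 88.38 is 70.88, so results computed without intermediate rounding differ in the first decimal from the figures quoted in the text, which use 70.8.

```python
# Values quoted in the text (Tables 4 and 5).
pc1_percent, pc2_percent = 88.38, 9.02   # percent variance of PC1 and PC2
r2_eq1, r2_eq2 = 0.802, 0.601            # R^2 of Eqs. (1) and (2)

later_pcs = 100.0 - pc1_percent - pc2_percent   # share of PC3 to PC12
explained_eq1 = r2_eq1 * pc1_percent            # removed through PC1
explained_eq2 = r2_eq2 * pc2_percent            # removed through PC2
removed = explained_eq1 + explained_eq2         # tied to H, LAT and LON
retained = 100.0 - removed                      # left in the RLST imagery

print(round(later_pcs, 2), round(explained_eq1, 2), round(explained_eq2, 2))
print(round(removed, 2), round(retained, 2))
```

The retained share equals the two residual shares plus the share of the later PCs, which is the quantity the RLST imagery represents.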
The LST frequency distributions (Fig. 3) are bimodal. Density slicing of the November LST image (Fig. 4) outlines two regions, (a) the coastal one with positive LST and elevation statistics equal to 885 ± 654 m and (b) a continental one with negative LST and elevation statistics equal to 1809 ± 476 m. So H, LAT and LON do play a major role in the observed values of LST. On the contrary, the RLST frequency distributions (Fig. 7) present means that approach zero (Table 6).
According to an empirical rule (Daniel and Tennant 2001), when the absolute value of the skew exceeds a value, such as 0.5, then the distribution is sufficiently asymmetrical to cause concern that the dataset may not represent a normal distribution. In the current case study (Fig. 7) the RLST distributions present absolute value of skew that is far less than 0.5 (Table 6).
Kurtosis characterizes the relative peakedness or flatness of a distribution compared with the normal distribution (Landam and Everitt 2004). A value of 0 represents a mesokurtic curve, more particularly the bell-shaped curve of the normal distribution (Daniel and Tennant 2001). Positive kurtosis indicates a relatively peaked (leptokurtic) distribution (Miliaresis and Paraschou 2005). The frequency distributions of July and August are rather leptokurtic, since they present kurtosis greater than 0.7 (Table 6). So RLST values are concentrated more closely around the mean value in July and August. It is concluded that the regional increase of LST during the summer masks the regions of high thermal anomaly (positive or negative) and attenuates their difference from the surrounding land.
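Skewness and kurtosis in the sense used above (0 for the normal curve) can be computed from the central moments. The Python sketch below is an illustration under assumptions: it uses the simple moment-based estimators, whereas statistical packages often apply small-sample corrections, so values may differ slightly from those in Table 6; the RLST sample and the function name `skew_and_excess_kurtosis` are hypothetical.

```python
from statistics import mean

def skew_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (0 for a normal curve)."""
    n = len(values)
    m = mean(values)
    m2 = sum((v - m) ** 2 for v in values) / n   # variance
    m3 = sum((v - m) ** 3 for v in values) / n   # third central moment
    m4 = sum((v - m) ** 4 for v in values) / n   # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# Hypothetical RLST values (degrees Celsius) for one month.
rlst = [-3.0, -1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 3.0]
skew, kurt = skew_and_excess_kurtosis(rlst)
```

For this symmetric sample the skewness is zero and the excess kurtosis is positive, i.e. the values cluster around the mean more than a normal distribution with the same variance would.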
Thresholding of either very high (positive thermal anomaly) or very low (negative thermal anomaly) RLST values (Fig. 6) on the basis of the RLST frequency distributions (Fig. 7) can map the spatial distribution of the thermal anomaly pattern for each month.
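One possible way to perform such thresholding is sketched below in Python. The cut-offs (mean ± k standard deviations), the value of k, the data and the function name `classify_anomalies` are assumptions made for illustration; the text does not prescribe specific threshold values.

```python
from statistics import mean, pstdev

def classify_anomalies(rlst_values, k=2.0):
    """Label each pixel as -1 (negative anomaly), 0, or +1 (positive anomaly).

    The cut-offs mean +/- k standard deviations are an assumption of this
    sketch, chosen from the frequency distribution of the month.
    """
    m, s = mean(rlst_values), pstdev(rlst_values)
    low, high = m - k * s, m + k * s
    return [1 if v > high else -1 if v < low else 0 for v in rlst_values]

# Hypothetical RLST values (degrees Celsius) for one month.
rlst_month = [0.2, -0.4, 0.1, 6.5, -0.3, 0.0, 0.4, -7.0, 0.3, -0.1, 0.2, 0.1]
labels = classify_anomalies(rlst_month, k=1.5)
```

Mapping the labels back to pixel positions would give the spatial pattern of positive and negative anomalies for that month.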
Compare the vector visualizations of the LST data (Fig. 5) and the RLST data (Fig. 8). The LST vectors present a gradual increase of LST from January to July, followed by a gradual decrease (Fig. 5). On the contrary, the RLST vectors present a residual bending (negative RLST anomaly) in spring. The bending is verified by the negative mean values of RLST in Table 6, which also shows that the bending is maximized in April. A tentative hypothesis is that snow melting and the associated water table depth fluctuation might be responsible for the seasonal bending of the vectors. The geomorphology of the study area (elevated mountain ranges alternating with desert basins) and the seasonal snowfall pattern (Wang and Gillies 2012) support this hypothesis. Miliaresis (2014a, b) observed a rather similar (in concept) LST pattern in Antarctica, which was related to ice surface melting during the long Antarctic day (summer). Nevertheless, seasonal winds and the air circulation pattern might also be responsible for the negative thermal anomaly bending.
Conclusion
The Selective Variance Reduction script takes advantage of the Eig function of the Octave 4.0 software, which determines the eigenvectors and eigenvalues from the variance–covariance matrix. Thus, it is possible to apply the elevation, latitude and longitude decorrelation stretch to multitemporal monthly averaged night LST imagery for 2007 in the SW USA, in an attempt to quantify thermal anomalies in degrees Celsius. For each reconstructed LST image, the thermal anomaly value (RLST) per pixel expresses the deviation, in degrees Celsius, of LST from the LST predicted by elevation, latitude and longitude. Under these circumstances, there is a direct correspondence between the reconstructed LST values and both the magnitude and the sign (positive or negative) of the thermal anomalies.
References
Barnett TP, Pierce DW (2008) When will lake Mead go dry? Water Resour Res 44:W03201. doi:10.1029/2007WR006704
Daniel C, Tennant K (2001) DEM quality assessment. In: Maune D (ed.) Digital elevation model technologies and applications: the DEM users manual. Bethesda: American Society for Photogrammetry and Remote Sensing, pp 395–440
Eig (2015) Function Eig. Octave. http://octave.sourceforge.net/octave/function/eig.html. Accessed 20 Jan 2016
Farr TG, Kobrick M (2000) Shuttle radar topography mission produces a wealth of data. Am Geophys Union EOS 81:583–585
Friedel MJ (2012) Data-driven modeling of surface temperature anomaly and solar activity trends. Environ Model Softw 37:217–232
Guo D, Westra S, Maier HR (2016) An R package for modelling actual, potential and reference evapotranspiration. Environ Model Softw 78:216–224
Jolliffe I (2002) Principal component analysis, 2nd edn. Springer-Verlag, New York
Landam S, Everitt BS (2004) A handbook for statistical analysis using SPSS. Chapman and Hall/CRC Press, New York
Li S, Zhao Z, Miaomiao X, Wang Y (2010) Investigating spatial non-stationary and scale-dependent relationships between urban surface temperature and environmental factors using geographically weighted regression. Environ Model Softw 25:1789–1800
Liu T, Wang Z, Huang X, Cao L, Niu M, Tian Z (2015) An effective Antarctic ice surface temperature retrieval method for MODIS. Photogramm Eng Remote Sens 81:861–872
Mather PM, Koch M (2011) Computer processing of remotely-sensed images, 4th edn. Wiley, New York
Miliaresis G (2009) Regional thermal and terrain modeling of the Afar Depression from multitemporal night LST data. Int J Remote Sens 30:2429–2446
Miliaresis G (2012a) Elevation, latitude/longitude decorrelation stretch of multitemporal LST imagery. Int J Remote Sens 33:6020–6034
Miliaresis G (2012b) Elevation, latitude/longitude decorrelation stretch of multitemporal LST imagery. Photogramm Eng Remote Sens 78:151–160
Miliaresis G (2012c) Selective variance reduction of multitemporal LST imagery in the East Africa Rift System. Earth Sci Inf 5:1–12
Miliaresis G (2013) Terrain analysis for active tectonic zone characterization, a new application for MODIS night LST (MYD11C2) dataset. Int J Geogr Inf Sci 27:1417–1432
Miliaresis G (2014a) Spatiotemporal patterns of land surface temperature of Antarctica from MODIS Monthly LST data (MYD11C2). J Spat Sci 59:157–166
Miliaresis G (2014b) Daily temperature oscillation enhancement of multitemporal LST imagery. Photogramm Eng Remote Sens 80:423–428
Miliaresis G, Paraschou Ch (2005) Vertical accuracy of the SRTM DTED level 1 of Crete. Int J Appl Earth Obs GeoInf 7:49–59
MYD11C2 (2011) Aqua-MODIS monthly LST imagery. https://lpdaac.usgs.gov/dataset_discovery/modis/modis_products_table/MYD11C2. Accessed 20 Jan 2016
MYD11C2.006 (2016) Aqua-MODIS monthly LST imagery, version 006. http://e4ftl01.cr.usgs.gov/MOLA/MYD11C2.006/. Accessed 20 Jan 2016
Octave (2015) GNU Octave Version 4.0. http://www.gnu.org/software/octave/. Accessed 20 Jan 2016
SRTM30 DEM (2015) SRTM30 Digital Elevation Model, Version 2.1. US Geological Survey. http://e4ftl01.cr.usgs.gov/SRTM/SRTMGL30.002/. Accessed 20 Jan 2016
SRTM30 DOC (2015) SRTM30 Documentation, US Geological Survey. http://dds.cr.usgs.gov/srtm/version2_1/SRTM30/srtm30_documentation.pdf. Accessed 20 Jan 2016
Voinov AA, Fitz C, Boumans R, Costanza R (2004) Modular ecosystem modeling. Environ Model Softw 19:285–304
Wan Z (2013) Collection-6 MODIS land surface temperature products. Users’ guide. University of California, Santa Barbara. http://www.icess.ucsb.edu/modis/LstUsrGuide/MODIS_LST_products_Users_guide. Accessed 20 Jan 2016
Wan Z (2014) New refinements and validation of the Collection-6 MODIS land-surface temperature/emissivity products. Remote Sens Environ 140:36–45
Wang S-Y, Gillies RR (2012) Climatology of the US Intermountain West. In: Wang S-Y (ed) Modern climatology. InTech, pp 153–176. ISBN: 9789535100959. doi:10.5772/33940. http://www.intechopen.com/books/modernclimatology/climatologyoftheusintermountainwest
Wilcke RAI, Bärring L (2016) Selecting regional climate scenarios for impact modelling studies. Environ Model Softw 78:191–201
Cite this article
Miliaresis, G.C. An unstandardized selective variance reduction script for elevation, latitude and longitude decorrelation stretch of multitemporal LST imagery. Model. Earth Syst. Environ. 2, 41 (2016). https://doi.org/10.1007/s4080801601030
Keywords
 Land surface temperature
 MODIS
 Thermal anomalies
 Spatial modeling
 Principal component analysis
 Octave