Encyclopedia of GIS

2017 Edition
| Editors: Shashi Shekhar, Hui Xiong, Xun Zhou

Semantic Kriging

  • Shrutilipi Bhattacharjee
  • Soumya K. Ghosh
Reference work entry
DOI: https://doi.org/10.1007/978-3-319-17885-1_1577

Definition

Advances in remote sensing (RS) and geographic information systems (GIS) have introduced a significant number of research challenges. Proper staging of the spatial data is necessary, as geospatial repositories contain missing and erroneous information. Accurate prediction of meteorological parameters is therefore indispensable for most applications related to weather and climatological analysis. Geostatistical interpolation methods are often preferred for predicting meteorological parameters (such as land surface temperature (LST), normalized difference vegetation index (NDVI), and moisture stress index (MSI)), as they yield minimal error. Regression-based methods perform better because the autocorrelation within the region of interest (RoI) is modeled and incorporated in the regression process. Popular approaches for spatial interpolation include kriging, which is based on the notion of linear regression. Well-known members of the kriging family are ordinary kriging (OK), simple kriging (SK), and universal kriging (UK).

Although existing regression-based interpolation methods show moderate performance in predicting climatological parameters, they do not consider the semantic knowledge of the terrain in the regression process. When GIS deals with climatological parameters, different land cover classes behave differently and influence these parameters to varying degrees. This entry focuses on a new scheme of spatial interpolation, namely, semantic kriging (SemK), which was first proposed in Bhattacharjee et al. (2014) and extended in Bhattacharjee and Ghosh (2014, 2015). It is a member of the univariate kriging family and is based on the concept of minimizing the mean square error of prediction. It modifies the existing ordinary kriging (OK) method by incorporating semantic knowledge of the spatial land cover classes into the interpolation process. In SemK, the variations between the land covers are modeled using an ontology (Gruber 1995; Bhattacharjee et al. 2012), which captures the association between the spatial land cover classes (Bhattacharjee et al. 2012) and organizes them into a hierarchy. This entry covers the following aspects of SemK and discusses its applicability in this domain:
  • Detailed description of the semantic kriging (SemK) approach with the following functionalities:
    • Capturing the semantic knowledge of the land cover classes within the RoI and their formal representation with land cover ontology

    • Measuring semantic similarity and a priori correlation (spatial importance) between each pair of leaf land cover classes in the ontology

    • Modifying the ordinary kriging process, considering the spatial importance and the semantic similarity between the spatial land covers

  • Mathematical modeling of the modified weight matrix and other related parameters of SemK

  • Performance evaluation and comparison of SemK with other interpolation methods with real LST data

Historical Background

Spatial interpolation methods have been applied in many disciplines, such as meteorology, ecology, soil science, and water and marine science. These methods are typically application oriented and data specific. The performance of a method may vary with the sample size, the presence of outliers, the surrounding conditions, the type of application, etc. Among the deterministic interpolation methods, nearest neighbors (NN), inverse distance weighting (IDW), and inverse distance squared (IDS) are widely used. The kriging family (e.g., OK and SemK (univariate), and UK or KED (multivariate)) is among the most popular stochastic interpolation techniques (see Fig. 1).
Semantic Kriging, Fig. 1

Popularity of spatial interpolation methods

Several research articles have compared these methods for different geospatial applications. Nalder and Wein (1998) compared three interpolation methods, namely, OK, NN, and IDS, for estimating monthly temperature and precipitation in Western Canada, and observed that NN performs better than IDS, followed by OK. Van Kuilenburg et al. (1982) compared these methods for applications in agriculture and soil science and found that OK is preferred over the others. Similarly, Brus et al. (1996) compared the performance of these methods for estimating different soil properties and found that OK reports minimal error. Ruddick (2007) compared OK, NN, and IDW for Australian seascape maps and found similar results for all three. Nalder and Wein (1998) reported that the IDS method and its modifications are the most frequently applied spatial interpolation methods across applications, and Karydas et al. (2009) reported the same. Franzen and Peck (1995) and Weisz et al. (1995) reported kriging and IDS to be the two most widely used methods in GIS. In terms of accuracy, some studies report that kriging outperforms IDW (Kravchenko 2003), whereas others report that the latter produces better results than kriging (Nalder and Wein 1998). Kravchenko (2003) compared OK and IDW for grid soil sampling, considering the effect of spatial autocorrelation, and reported that kriging outperforms IDW. Other comparisons are also reported in the literature. Schloeder et al. (2001) found OK and IDW to perform with the same accuracy. Yasrebi et al. (2009) found OK to perform better than IDW in determining the spatial variability of soil chemical properties. Mueller et al. (2004) observed that IDW performed as well as or better than kriging.

Li and Heap (2008) proposed a frequency graph representing how often the popular interpolation methods have been compared and recommended in 51 reviewed comparative studies. This analysis is extended in Li and Heap (2011) with more reported results. Similarly, we have extended this study with a few more recent works and obtained a new popularity graph, shown in Fig. 1. Among fourteen candidate interpolation methods, ordinary kriging (OK) is the most popular technique, followed by inverse distance weighting (IDW), inverse distance squared (IDS), nearest neighbors (NN), and universal kriging (UK), in that order. From the state of the art, it is evident that none of the existing interpolation methods incorporates the land cover knowledge of the terrain into the prediction of climatological parameters. The semantic kriging approach, first proposed by Bhattacharjee et al. (2014), quantifies this land cover-based semantic knowledge of the terrain and enhances the prediction accuracy by incorporating it into the interpolation process.

Scientific Fundamentals

Several methods have been developed for spatial interpolation in various disciplines. Estimation processes of most of the spatial interpolation methods can be represented as the weighted average of the sampled locations. The general estimation equation of spatial interpolation can be given as follows:
$$ \hat{Z}(x_{0}) = \sum\limits_{i=1}^{N}w_{ i}Z(x_{i}) $$
where \(\hat{Z}(x_{0})\) is the predicted value of the parameter at the prediction point \(x_{0}\), \(Z(x_{i})\) is the actual value at the interpolating point \(x_{i}\), \(w_{i}\) is the weight assigned to the interpolating point \(x_{i}\), and \(N\) represents the number of interpolating points.
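The weighted-average form can be illustrated with a minimal Python sketch. The weights here come from inverse distance weighting with an assumed power of 2, chosen only because it is the simplest way to produce a set of \(w_{i}\); the function names, sample locations, and LST values are hypothetical and not taken from the cited studies.

```python
import math

def idw_weights(points, target, power=2.0):
    """Inverse-distance weights, normalised to sum to one (one possible choice of w_i)."""
    dists = [math.dist(p, target) for p in points]
    # A sample located exactly at the target receives all the weight.
    for i, d in enumerate(dists):
        if d == 0.0:
            return [1.0 if j == i else 0.0 for j in range(len(points))]
    raw = [1.0 / d ** power for d in dists]
    total = sum(raw)
    return [r / total for r in raw]

def weighted_estimate(weights, values):
    """Z_hat(x0) = sum_i w_i * Z(x_i)."""
    return sum(w * z for w, z in zip(weights, values))

# Hypothetical sample locations (km) and LST values (deg C).
points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
values = [30.0, 34.0, 32.0]
w = idw_weights(points, (1.0, 1.0))
z_hat = weighted_estimate(w, values)
```

Kriging keeps the same estimator and differs only in how the weights are derived.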

Ordinary Kriging (OK)

Kriging represents the family of least-square regression-based interpolation methods. It advances upon other interpolation techniques through modeling the underlying spatial autocorrelation among the interpolating points. The general kriging methods use the following equation for predicting the random field Z at x0:
$$ \hat{Z}(x_{0})-\mu =\sum\limits_{ i=1}^{N}w_{ i}[Z(x_{i}) -\mu (x_{0})] $$
where \(\mu\) is the mean over the RoI and \(\mu(x_{0})\) is the mean at the prediction point \(x_{0}\). Ordinary kriging assumes the first moment to be constant over the RoI, i.e., \(E\{Z(x_{i})\} = E\{Z(x_{0})\} \Rightarrow \mu = \mu(x_{0})\), where \(\mu\) is unknown. In kriging, the assigned weight is calculated from the semivariogram model. The semivariance \(\gamma(h)\) captures the underlying spatial relationships between the random fields and the amount of autocorrelation with respect to the Euclidean distances between the sample points. It is half the variance of the differences between the random field values of all the interpolating points separated by lag distance \(h\). The semivariance \(\gamma(h)\) of the random field \(Z\) between sample points that are \(h\) distance apart is defined as follows:
$$ \gamma (h) = \frac{\sum\limits_{i=1}^{M}[Z(x_{i}) - Z(x_{i} + h)]^{2}} {2M} $$
where \(\gamma(h)\) is the semivariance for the lag interval \(h\), \(Z(x_{i})\) is the parameter value at a point \(x_{i}\), \(Z(x_{i} + h)\) is the measured parameter value at the point separated by lag distance \(h\) from \(x_{i}\), and \(M\) is the total number of sample point pairs within lag interval \(h\). Plotting \(\gamma(h)\) against \(h\) generates the experimental semivariogram. This model is used to measure the covariances between all the sample points in the RoI. Hence, the spatial covariance is a function of the Euclidean distance between sample points in two-dimensional space, which is calculated from the semivariogram.
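As a rough illustration of the semivariance definition, the following Python sketch computes one point of the experimental semivariogram. The lag tolerance `tol`, the function name, and the sample data are assumptions made for the example; practical implementations usually bin the lags and then fit a model to the resulting points.

```python
import math

def experimental_semivariance(points, values, lag, tol):
    """Semivariance over all pairs whose separation lies within lag +/- tol.

    Returns (gamma, m), where m is the number of contributing pairs.
    """
    sq_sum, m = 0.0, 0
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            if abs(math.dist(points[i], points[j]) - lag) <= tol:
                sq_sum += (values[i] - values[j]) ** 2
                m += 1
    if m == 0:
        return None, 0  # no pairs fall in this lag interval
    return sq_sum / (2 * m), m

# Hypothetical samples on a line, one unit apart.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 4.0, 7.0]
gamma_1, m_1 = experimental_semivariance(points, values, lag=1.0, tol=0.1)
gamma_2, m_2 = experimental_semivariance(points, values, lag=2.0, tol=0.1)
```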
Let \(\epsilon(x_{0})\) be the error in estimating the parameter value \(Z\) at \(x_{0}\). If \(Z(x_{0})\) and \(\hat{Z}(x_{0})\) are the actual and the predicted parameter values at \(x_{0}\), then \(\epsilon(x_{0})\) is given as follows:
$$ \begin{array}{lll} \epsilon (x_{0})& =& \hat{Z}(x_{0}) - Z(x_{0}) \\ & =& \sum\limits_{i=1}^{N}w_{ i}Z(x_{i}) - Z(x_{0}) \end{array}$$
As ordinary kriging assumes the random field to be stationary over the RoI, the expectation of the error should be zero. Therefore, the following holds:
$$ \begin{array}{rrr} E\left (\epsilon (x_{0})\right )& =& 0 \\ \sum\limits_{i=1}^{N}w_{ i} \times E(Z(x_{i})) - E(Z(x_{0}))& =& 0 \\ \mu \sum\limits_{i=1}^{N}w_{ i}-\mu & =& 0 \\ \sum\limits_{i=1}^{N}w_{i}& =& 1 \\ {\mathbf 1}^{T}\mathbf{W}& =& 1 \end{array}$$
Thus, the general estimation equation of ordinary kriging can be given as follows:
$$ \hat{Z}(x_{0}) =\sum\limits_{ i=1}^{N}w_{ i}Z(x_{i}) $$
constrained by \(\mathbf{1}^{T}\mathbf{W} = 1\), where \(\mathbf{W}\) is the weight vector of size \(N\), given as \([w_{1}\; w_{2}\; \cdots\; w_{N}]^{T}\).
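A minimal sketch of how OK weights might be obtained under the sum-to-one constraint is given below. It assumes an exponential covariance model with hypothetical sill and range values (in practice the model is fitted to the experimental semivariogram) and solves the usual augmented system, in which a Lagrange multiplier enforces the constraint. All names and numbers are illustrative.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def exp_cov(h, sill=1.0, rng=3.0):
    """Exponential covariance model (an assumed model, for illustration only)."""
    return sill * math.exp(-h / rng)

def ok_weights(points, target):
    """Solve C W + lam * 1 = D together with 1^T W = 1; return (W, lam)."""
    n = len(points)
    a = [[exp_cov(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [exp_cov(math.dist(p, target)) for p in points] + [1.0]
    sol = solve(a, b)
    return sol[:n], sol[n]

# Hypothetical samples at the corners of a square, predicted at its centre.
points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
values = [30.0, 34.0, 32.0, 36.0]
weights, lam = ok_weights(points, (1.0, 1.0))
z_hat = sum(w * z for w, z in zip(weights, values))
```

By symmetry, the four corner samples receive equal weights in this configuration.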

Ontology and Tree Representation

Ontology is a concept from philosophy that has been extended to many fields of study for representing a domain with its “categories of being” and their “interrelationships.” In computer science and information science, ontology is mainly used for the knowledge representation of a particular “domain of discourse” and for defining different concepts of the system. It formally represents the knowledge as a set of classes, individuals, parameters, relations, function terms, restrictions, rules, axioms, events, etc. Gruber notes that “ontologies are often equated with taxonomic hierarchies of classes, class definitions, and the subsumption relation…” (Gruber 1995). Among the different types of ontology representation (such as tree, graph, network, hierarchy, etc.), the hierarchical ontology exhibits some important properties that make it the most appropriate choice for representing the domain knowledge. Some of these properties are as follows:
  • Semantic relations (e.g., hypernym, hyponym, meronym) can be used for building the ontology hierarchy.

  • The inheritance hierarchies can be formed from the ontology itself.

  • Hierarchical ontologies can be nested.

  • Reasoning of ontology hierarchy provides the amount of association between concepts.

In semantic kriging (SemK), the traditional spatial interpolation process is extended with semantic land cover knowledge for improving prediction of the meteorological parameters. This knowledge of the terrain is represented by the hierarchical ontology of land cover distribution. The concept is pictorially represented in Fig. 2.
Semantic Kriging, Fig. 2

Ontology and semantic kriging

Semantic Kriging (SemK)

Semantic kriging (SemK) extends ordinary kriging (OK) by incorporating the semantics of, and the correlation between, the spatial land cover classes of the terrain into the interpolation process. In OK, the covariance and the semivariogram are functions of distance (Euclidean distance in two-dimensional space) and are independent of the influence of the nearby spatial land cover distribution, so the existing kriging processes do not consider this knowledge of the terrain. SemK maps the traditional covariance function to a higher dimension by blending in the semantic knowledge of the nearby land covers, for a more informative estimation at the prediction point. Figure 3 presents the underlying notion of SemK in comparison with other existing interpolation methods. In the existing non-geostatistical methods (e.g., NN, IDW, IDS, etc.), the interpolating points (represented with the solid circles in Fig. 3a, b) are spatially related to the prediction point (represented with the filled star). In the existing geostatistical methods (OK, UK, etc.), the interpolating points are also considered to be spatially related to each other. In SemK (see Fig. 3c), the sample points are spatially as well as semantically related, as each of them is represented by its representative land cover (e.g., building, vegetation cover, industry, water body, etc.). In SemK, following Tobler’s law of geographic proximity (Tobler 1970), land covers are distinguished such that a semantically similar and correlated land cover receives more weight than a semantically distant one. The overall SemK framework is depicted in Fig. 4.
Semantic Kriging, Fig. 3

Working principles of popular spatial interpolation methods

Semantic Kriging, Fig. 4

Semantic kriging (SemK) framework

First, a spatial land cover ontology is constructed with all possible land cover classes of the spatial region of interest (RoI) with the help of domain experts. These land covers are represented as the concepts in the ontology, which are organized into a hierarchy based on standard semantic relations, such as hyponym, meronym, etc. By the property of hierarchical ontology, semantically similar land covers are closer in the hierarchy than dissimilar ones (in terms of ontological hop distance). The spatial region Kolkata (a metropolitan city in India, central coordinate: 22.567° N, 88.367° E) has several spatial land cover classes, namely, built-up, agriculture, forest, wastelands, water bodies, wetlands, etc. They are organized into the ontology hierarchy depicted in Fig. 5, which is constructed using the “is-a” (hyponym) relation.
Semantic Kriging, Fig. 5

Spatial land cover ontology (for LST prediction)
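The notion of ontological hop distance in an “is-a” hierarchy can be sketched in Python as follows. The hierarchy below is a small hypothetical fragment written for this example: it borrows a few class names from the text but does not reproduce the ontology of Fig. 5, and the sub-classes are invented for illustration.

```python
# Hypothetical "is-a" hierarchy: each concept maps to its parent concept.
PARENT = {
    "built-up": "land cover",
    "water bodies": "land cover",
    "forest": "land cover",
    "residential": "built-up",
    "industrial": "built-up",
    "river": "water bodies",
    "lake": "water bodies",
}

def path_to_root(concept):
    """Return the chain of concepts from the given concept up to the root."""
    path = [concept]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def hop_distance(a, b):
    """Number of edges between two concepts via their lowest common ancestor."""
    pa, pb = path_to_root(a), path_to_root(b)
    depth_in_b = {c: i for i, c in enumerate(pb)}
    for i, c in enumerate(pa):
        if c in depth_in_b:
            return i + depth_in_b[c]
    raise ValueError("concepts do not share a root")

d_siblings = hop_distance("residential", "industrial")
d_far = hop_distance("residential", "river")
```

Sibling leaves under the same parent are closer than leaves that meet only at the root, which is the property the hierarchy is meant to capture.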

The ontology is domain and region specific, i.e., it depends on the nature of the RoI and the prediction parameter. For different application domains, the structure of the ontology hierarchy varies with the type of concept it represents, the number of participating concepts, the relations, etc. It is also adaptive in nature. For spatial interpolation, each sample point must correspond to one of the leaf land covers (a land cover represented by a leaf concept) in the ontology. The sample points are mapped to the most appropriate representative leaf land cover class in the hierarchy, and the amount of association between the spatial land cover classes in the ontology is then evaluated. It is measured in two ways: by evaluating the spatial importance between a pair of leaf land covers and by analyzing the semantic similarity between them. These processes are termed spatial importance measurement and semantic similarity measurement, respectively. These two parameters (semantic similarity and spatial importance) modify the traditional covariance measure of OK and map it into a higher dimension. In ordinary kriging, the assigned weights are a function of Euclidean distance only. The weight assigned in SemK is a function of distance as well as the semantic property of the terrain. As the covariance is modified, the weight assigned by ordinary kriging to each of the interpolating points is modified too. The weights are then normalized to predict the parameter value. As SemK has more decision parameters than OK, the former is a more informative prediction process than the latter (Bhattacharjee and Ghosh 2014).

Spatial Importance Measurement

The spatial importance between each pair of leaf land cover classes in the ontology is measured by the correlation analysis between them, with respect to the prediction parameter. For this purpose, the whole RoI is subdivided into \(k\) non-overlapping zones, such that \(\bigcup\nolimits_{i=1}^{k}R_{i} = RoI\). To carry out a pairwise correlation analysis, k pairs of sample points are chosen from each of the zones. In each pair, the first element represents the first land cover type, and the second represents the second type. These points are chosen by obeying the law of geographic proximity, i.e., to be influenced by each other, the points of a pair must reside within a predefined distance \(d\). For our study, \(d\) is chosen as 5 km. The value of this correlation metric lies in [−1, 1] and is further normalized to a positive range (e.g., [1, 3]) to avoid negative mapping of the covariances. This correlation analysis has the following properties:

  • The correlation analysis is dependent on the primary parameter to be predicted, i.e., the correlation value between two land cover classes in a particular RoI is not same for all the parameters.

  • It is a priori correlation, i.e., the correlation between a pair of land cover classes is determined without considering impacts of other nearby land covers.

  • It is a global correlation analysis, i.e., the correlation score between a pair is constant for the whole study region.

This correlation measure between a pair of land cover classes is also termed the relative spatial importance. For a mathematical representation of this metric, let the representative land covers of the prediction point \(x_{0}\) and the interpolating point \(x_{i}\) be \(f_{0}\) and \(f_{i}\), respectively. Let the correlation between \(x_{0}\) and \(x_{i}\) be \(se_{i}\), which is given as follows:
$$ \begin{array}{lll} se_{i} = Corr_{\mathrm{prediction\_parameter}}(x_{0},x_{i}) \\ = Corr_{\mathrm{prediction\_parameter}}(f_{0},f_{i}) \\ = \frac{\sum\limits_{m=1}^{k}(Z(f_{0_{m}}) -\overline{Z(f_{0})})(Z(f_{i_{m}}) -\overline{Z(f_{i})})} {\sqrt{\sum\limits_{m=1 }^{k }\!\!(Z(f_{0_{m}})- \overline{Z(f_{0})})^{2 } \sum\limits_{m=1 }^{k }(Z(f_{i_{m}})- \overline{Z(f_{i})})^{2}}} \end{array}$$
where \(Z(f_{p_{q}})\) represents the random field value of the \(q\)th sample point representing the land cover \(f_{p}\), and \(\overline{Z(f_{p})}\) represents the average of the random field values of the land cover \(f_{p}\) over \(k\) sample points. For all the interpolating points, these values form an \([N \times 1]\) vector, given as \(\mathbf{SI}^{T} = [se_{1}\; se_{2}\; \cdots\; se_{N}]\).
Similarly, due to spatial autocorrelation, the relative spatial importance must be measured between each pair of interpolating points as well. The relative spatial importance between the \(i\)th and \(j\)th interpolating points, \(se_{ij}\), is given as follows:
$$ \begin{array}{lll} se_{ij} = Corr_{\mathrm{prediction{\_}parameter}}(x_{i},x_{j}) \\ = Corr_{\mathrm{prediction{\_}parameter}}(f_{i},f_{j}) \\ = \frac{\sum\limits_{m=1}^{k}(Z(f_{i_{m}})-\overline{Z(f_{i})})(Z(f_{j_{m}}) -\overline{Z(f_{j})})} {\sqrt{\sum\limits_{m=1 }^{k }(Z(f_{i_{m}})- \overline{Z(f_{i})})^{2} \sum\limits_{m=1}^{k}(Z(f_{j_{m}})- \overline{Z(f_{j})})^{2}}} \end{array}$$
Hence, for \(N\) interpolating points, an \([N \times N]\) symmetric matrix is formed, termed the spatial importance matrix and denoted as \(\mathbf{W}_{2}\).
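The correlation-based spatial importance can be sketched in Python as a Pearson correlation over paired samples followed by a rescaling to a positive range. The linear mapping from [−1, 1] to [1, 3] used below is an assumption, since the text gives the target range only as an example and does not specify the mapping; the function names and LST values are hypothetical.

```python
import math

def pearson(u, v):
    """Pearson correlation between two equally long sequences of paired values."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = math.sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))
    return num / den

def rescale(r, lo=1.0, hi=3.0):
    """Map a correlation in [-1, 1] linearly onto [lo, hi] (assumed mapping)."""
    return lo + (r + 1.0) * (hi - lo) / 2.0

# Hypothetical paired LST samples (deg C) for two land cover classes,
# one pair per zone, each pair lying within the distance threshold d.
lst_class_a = [30.0, 31.0, 33.0, 34.0]
lst_class_b = [28.0, 29.0, 31.0, 32.0]
lst_class_c = [34.0, 33.0, 31.0, 30.0]

se_ab = rescale(pearson(lst_class_a, lst_class_b))
se_ac = rescale(pearson(lst_class_a, lst_class_c))
```

Evaluating this for every pair of interpolating points would fill the symmetric matrix described above.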

Semantic Similarity Measurement

The semantic similarity between any two sample points, or their representative land covers in the ontology, is measured by analyzing the hierarchy structure using the modified context resemblance method (Bhattacharjee et al. 2014). This measure follows Tobler’s law of geographic proximity, i.e., land cover classes that are farther apart in the ontology (with respect to ontological hop distance) are less similar, and vice versa. The semantic similarity between the prediction point \(x_{0}\) and the \(i\)th interpolating point \(x_{i}\) is referred to as \(sd_{i}\) and is given as follows:
$$ sd_{i} = \frac{ \frac{m_{0}} {\vert f_{0}\vert } + \frac{m_{i}} {\vert f_{i}\vert }} {2} $$
where \(\vert f_{0}\vert\) and \(\vert f_{i}\vert\) are the total numbers of concepts in the land cover paths of the prediction point \(x_{0}\) and the \(i\)th interpolating point, respectively. Here, \(m_{0}\) and \(m_{i}\) (where \(m_{0} = m_{i}\)) are the numbers of concepts matching in their paths. With reference to the prediction point, these values form an \([N \times 1]\) vector for all the interpolating points, given as \(\mathbf{SD}^{T} = [sd_{1}\; sd_{2}\; \cdots\; sd_{N}]\).
Similarly, due to the presence of spatial autocorrelation, the relative semantic similarities must be calculated between all the pairs of interpolating points as well. The relative semantic similarity between the \(i\)th and \(j\)th interpolating points, \(x_{i}\) and \(x_{j}\), is referred to as \(sd_{ij}\). It is measured with the following equation:
$$ sd_{ij} = \frac{ \frac{m_{i}} {\vert f_{i}\vert } + \frac{m_{j}} {\vert f_{j}\vert}} {2} $$
where \(\vert f_{i}\vert\) and \(\vert f_{j}\vert\) are the total numbers of concepts in the land cover paths of the \(i\)th and \(j\)th interpolating points, respectively, and \(m_{i}\) and \(m_{j}\) (where \(m_{i} = m_{j}\)) are the numbers of concepts matching in their paths. For all the interpolating points, these values form an \([N \times N]\) symmetric matrix, termed the semantic similarity matrix and denoted as \(\mathbf{W}_{3}\).
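The path-based similarity formula can be illustrated with a short Python sketch. It assumes that each land cover path is listed from the root down to the leaf and that the "matching concepts" are those in the common prefix of the two paths; both are interpretations made for this example. The paths themselves are hypothetical and do not reproduce Fig. 5.

```python
def semantic_similarity(path_a, path_b):
    """sd = (m / |f_a| + m / |f_b|) / 2, with m the number of shared path concepts."""
    m = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        m += 1
    return (m / len(path_a) + m / len(path_b)) / 2

# Hypothetical root-to-leaf land cover paths.
residential = ["land cover", "built-up", "residential"]
industrial = ["land cover", "built-up", "industrial"]
forest = ["land cover", "forest"]

sd_self = semantic_similarity(residential, residential)
sd_sibling = semantic_similarity(residential, industrial)
sd_distant = semantic_similarity(residential, forest)
```

A land cover compared with itself scores 1, and the score falls as the two paths diverge closer to the root.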

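These four matrices (\(\mathbf{W}_{2}\) and \(\mathbf{W}_{3}\) of size \([N \times N]\), and \(\mathbf{SI}\) and \(\mathbf{SD}\) of size \([N \times 1]\)) modify the covariance matrix (\(\mathbf{C}\)) and the distance matrix (\(\mathbf{D}\)) of OK.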

Theoretical Error Analysis of Semantic Kriging

This section presents a brief theoretical analysis and formalization of the error reported by SemK, along with the related parameters and constraints. For both OK and SemK, let the actual value at the prediction point \(x_{0}\) be \(Z(x_{0})\). The prediction is carried out with respect to \(N\) known interpolating points, \(\{Z(x_{1}), \cdots, Z(x_{N})\}\), where \(Z(x_{i})\) is the actual parameter value at the point \(x_{i}\) and \(\hat{Z}(x_{0})\) is the predicted parameter value at the prediction point \(x_{0}\). For ordinary kriging, the two traditional matrices, namely, the covariance matrix (\(\mathbf{C}\)) and the distance matrix (\(\mathbf{D}\)), are defined as follows:

$$ \mathbf{C} = \left[\begin{array}{cccc} Var(Z_{1}) & Cov(Z_{1},Z_{2}) & \cdots & Cov(Z_{1},Z_{N}) \\ Cov(Z_{2},Z_{1}) & Var(Z_{2}) & \cdots & Cov(Z_{2},Z_{N}) \\ \vdots & \vdots & \ddots & \vdots \\ Cov(Z_{N},Z_{1}) & Cov(Z_{N},Z_{2}) & \cdots & Var(Z_{N}) \end{array}\right] $$
$$ \mathbf{D} = \left[\begin{array}{cccc} Cov\{Z_{1},Z_{0}(r)\} \\ Cov\{Z_{2},Z_{0}(r)\}\\ \vdots \\ Cov\{Z_{N},Z_{0}(r)\} \end{array} \right] $$
where \(Cov(Z_{i}, Z_{j})\) is the covariance between \(Z(x_{i})\) and \(Z(x_{j})\), and \(Var(Z_{i})\) denotes the covariance of \(Z(x_{i})\) with itself, i.e., the variance. For simplicity, \(Z(x_{i})\) is written as \(Z_{i}\). Here, the traditional covariance matrix \(\mathbf{C}\) can also be regarded as the first weight component (\(\mathbf{W}_{1}\), of size \([N \times N]\)) of SemK, which is inherited from OK. In ordinary kriging, the weight vector \(\mathbf{W}\) is determined from the covariance matrix \(\mathbf{C}\) and the distance matrix \(\mathbf{D}\). Similar to OK, SemK aims to optimize the weight vector such that the estimation variance \(\sigma_{E}^{2} = E([Z_{0} - \hat{Z}_{0}]^{2})\) is minimized.
In SemK, the covariance function includes the uncertainty of the local variables, such as the impact of the distribution of the surrounding spatial land covers. The covariance between the sample points is mapped to a higher dimension by blending in the local properties of the terrain. In this work, the semantic properties between the land covers are captured by the four matrix components \(\mathbf{W}_{2}\), \(\mathbf{W}_{3}\), \(\mathbf{SI}\), and \(\mathbf{SD}\). As the covariance is a measure of the variance of the difference between the parameter values, both the proposed semantic metrics, i.e., the semantic similarity and the spatial importance, are inversely proportional to the traditional covariance between random fields. Hence, for SemK, the modified covariance between the \(i\)th and \(j\)th sample points is represented as \(\frac{C_{ij}} {se_{ij}\cdot sd_{ij}}\). The significance of this mapping is that, at the same distance from the prediction point, the covariance between any two interpolating points may differ depending on the semantic property between them. The covariance increases if the semantic measure is smaller, and vice versa. Therefore, the modified covariance matrix \(\mathbf{C}^{\prime}\) and the modified distance matrix \(\mathbf{D}^{\prime}\) of SemK are given as follows:
$$ \begin{array}{lll} {\mathbf C^{\prime}}& =& \mathop{\mathop{-. -. -}\limits_{({\mathbf W_{2}} \circ {\mathbf W_{3}})}}\limits^{\mathbf{C}} \\ {\mathbf D^{\prime}}& =& \mathop{\mathop{-. -. -}\limits_{ (\mathbf{SI} \circ \mathbf{SD})}}\limits^{\mathbf{D}} \end{array}$$
where “∘” and “ −. −. −” denote the Hadamard product and Hadamard division between matrices, respectively.
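The elementwise construction of the modified matrices can be sketched in Python as follows. All numerical values are hypothetical, chosen only to show the Hadamard product and division for two interpolating points; the helper name `hadamard` is likewise an illustrative choice.

```python
from operator import mul, truediv

def hadamard(a, b, op):
    """Apply op elementwise to two equally shaped vectors or matrices."""
    if isinstance(a[0], list):
        return [[op(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
    return [op(x, y) for x, y in zip(a, b)]

# Hypothetical inputs for N = 2 interpolating points.
C = [[1.0, 0.5], [0.5, 1.0]]    # covariance matrix of OK
W2 = [[1.0, 2.5], [2.5, 1.0]]   # spatial importance matrix
W3 = [[1.0, 0.8], [0.8, 1.0]]   # semantic similarity matrix
D = [0.6, 0.4]                  # covariances with the prediction point
SI = [2.0, 1.6]                 # spatial importance w.r.t. the prediction point
SD = [1.0, 0.5]                 # semantic similarity w.r.t. the prediction point

# C' = C divided elementwise by (W2 o W3); D' = D divided elementwise by (SI o SD).
C_mod = hadamard(C, hadamard(W2, W3, mul), truediv)
D_mod = hadamard(D, hadamard(SI, SD, mul), truediv)
```

In this example the off-diagonal covariance drops from 0.5 to 0.25 because the two points are both strongly correlated and semantically similar.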
The weight vector of SemK, of dimension \([N \times 1]\), is denoted as \(\mathbf{W}^{\prime}\). Similar to ordinary kriging, SemK assumes the mean to be constant over the whole region. Let the mean square error at \(x_{0}\) be \(\sigma_{SemK}^{2}\). As in OK, requiring the expected estimation error to be zero gives \(\mathbf{1}^{T}\mathbf{W}^{\prime} = 1\) (Bhattacharjee et al. 2014). The mathematical expression for \(\sigma_{SemK}^{2}\) is given as follows:
$$ \sigma _{SemK}^{2} = C_{ 00}^{\prime} + \mathbf{W}^{{\prime}T}{\mathbf C^{\prime}}{\mathbf W^{\prime}}- 2\mathbf{W}^{{\prime}T}{\mathbf D^{\prime}} $$
where \(C_{00}^{\prime} = \frac{C_{00}} {(se_{00}\cdot sd_{00})}\), \(C_{00}\) is \(Cov\{Z_{0}(r), Z_{0}(r)\}\), and \(se_{00}\) and \(sd_{00}\) are the relative spatial importance and the relative semantic similarity of \(f_{0}\) with itself, respectively. As \(se_{00}\) represents the correlation of a random field with itself, \(se_{00} = 1\). Similarly, from the ontology hierarchy in Fig. 5, the value of \(sd_{00}\) is also 1. Being a least-square regression algorithm, SemK minimizes the error by minimizing the following expression:
$$ C_{00}^{\prime} + \mathbf{W}^{{\prime}T}{\mathbf C^{\prime}}{\mathbf W^{\prime}}- 2\mathbf{W}^{{\prime}T}{\mathbf D^{\prime}};\ni \mathbf{W}^{{\prime}T}{\mathbf 1} = 1 $$
To turn this into an unconstrained problem, the Lagrange multiplier \(2\lambda\) is introduced into the error expression. If \(K\) is the unconstrained error expression for SemK, then \(K\) is given as follows:
$$ K = C_{00}^{\prime}+\mathbf{W}^{{\prime}T}{\mathbf C^{\prime}}{\mathbf W^{\prime}}-2\mathbf{W}^{{\prime}T}{\mathbf D^{\prime}}+2\lambda (\mathbf{W}^{{\prime}T}{\mathbf 1}-1) $$
Solving the above equation, the weight vector of SemK, \(\mathbf{W}^{\prime}\), and \(\lambda\) (Bhattacharjee et al. 2014) can be expressed as follows:
$$ {\mathbf W^{\prime}} = [\mathop{\mathop{-. -. -}\limits_{ ({\mathbf W_{ 2}} \circ {\mathbf W_{3}})}}\limits^{\mathbf{C}}]^{-1}[[\mathop{\mathop{-. -. -}\limits_{ (\mathbf{SI} \circ \mathbf{SD})}}\limits^{\mathbf{D}}] -\lambda {\mathbf 1}] $$
$$ \lambda = \frac{{\mathbf 1}^{T}[\mathop{\mathop{-. -. -}\limits_{ ({\mathbf W_{2}} \circ {\mathbf W_{3}})}}\limits^{\mathbf{C}}]^{-1}[\mathop{\mathop{-. -. -}\limits_{ (\mathbf{SI} \circ \mathbf{SD})}}\limits^{\mathbf{D}}] - 1} {{\mathbf 1}^{T}[\mathop{\mathop{-. -. -}\limits_{ ({\mathbf W_{2}} \circ {\mathbf W_{3}})}}\limits^{\mathbf{C}}]^{-1}{\mathbf 1}} $$
Therefore, the predicted parameter value \(\hat{Z}(x_{0})\) at the prediction point \(x_{0}\) is expressed as \(\hat{Z} (x_{0}) =\sum\limits_{ i=1}^{N}w_{ i}^{\prime}Z(x_{ i})\), where \(w_{i}^{\prime}\) is the weight assigned to the \(i\)th interpolating point by SemK. Substituting \(\mathbf{C}^{\prime}\mathbf{W}^{\prime} = \mathbf{D}^{\prime} - \lambda \mathbf{1}\) and \(\mathbf{W}^{\prime T}\mathbf{1} = 1\) into the expression for \(\sigma_{SemK}^{2}\), the minimum variance of error in SemK (Bhattacharjee et al. 2014) is expressed as follows:
$$ \begin{array}{lll} \sigma _{SemK}^{2}& =& C_{00}^{\prime} - \mathbf{W}^{{\prime}T}{\mathbf D^{\prime}}-\lambda \\ & =& \frac{C_{00}} {(se_{00} \cdot sd_{00})} - \mathbf{W}^{{\prime}T}{\mathbf D^{\prime}} -\lambda \end{array}$$
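The closed-form expressions for \(\mathbf{W}^{\prime}\) and \(\lambda\) can be checked numerically with a minimal Python sketch. The modified matrices \(\mathbf{C}^{\prime}\) and \(\mathbf{D}^{\prime}\), the value \(C_{00}^{\prime} = 1\), and the sample values below are hypothetical. The error variance is evaluated directly from the quadratic form \(C_{00}^{\prime} + \mathbf{W}^{\prime T}\mathbf{C}^{\prime}\mathbf{W}^{\prime} - 2\mathbf{W}^{\prime T}\mathbf{D}^{\prime}\), and the products with \(\mathbf{C}^{\prime -1}\) are obtained by solving linear systems rather than forming an explicit inverse.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def semk_weights(c_mod, d_mod):
    """W' = C'^-1 (D' - lam 1), lam = (1^T C'^-1 D' - 1) / (1^T C'^-1 1)."""
    n = len(d_mod)
    u = solve(c_mod, d_mod)       # C'^-1 D'
    v = solve(c_mod, [1.0] * n)   # C'^-1 1
    lam = (sum(u) - 1.0) / sum(v)
    return [ui - lam * vi for ui, vi in zip(u, v)], lam

def error_variance(c00_mod, c_mod, d_mod, w):
    """sigma^2 = C'_00 + W'^T C' W' - 2 W'^T D'."""
    n = len(w)
    quad = sum(w[i] * c_mod[i][j] * w[j] for i in range(n) for j in range(n))
    return c00_mod + quad - 2.0 * sum(wi * di for wi, di in zip(w, d_mod))

# Hypothetical modified matrices for N = 2 interpolating points.
C_mod = [[1.0, 0.25], [0.25, 1.0]]
D_mod = [0.3, 0.5]
values = [30.0, 36.0]  # hypothetical LST values (deg C)

W, lam = semk_weights(C_mod, D_mod)
sigma2 = error_variance(1.0, C_mod, D_mod, W)
z_hat = sum(w * z for w, z in zip(W, values))
```

The resulting weights sum to one, as the unbiasedness constraint requires.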

Key Applications

Spatial interpolation is one of the most widely used prediction techniques in the field of remote sensing and geographic information systems. As SemK deals with the semantic knowledge of the terrain, it is significant for geospatial applications in which the prediction process can be improved by incorporating additional semantic knowledge. For example, the knowledge of land cover distribution plays an important role in the prediction of the land surface temperature (LST) of a region.

To evaluate SemK in terms of the predicted surface it generates, real land surface temperature data, obtained by processing satellite imagery, has been considered for our experimentation. Landsat ETM+ satellite data from the United States Geological Survey (USGS) has been used. The imagery of five spatial zones in the spatial region Kolkata, India, is tabulated in Table 1. The bounding box of each zone is specified in the table as [lower-left corner, upper-right corner]. The actual LST imagery along with the predicted imagery (by SemK, OK, and UK) for all the zones is shown in Table 1. It may be observed from the imagery that SemK, with land cover information, produces a better predicted surface (LST distribution of the terrain) than those produced by OK and the other methods. SemK also reports a higher peak signal-to-noise ratio (PSNR) than OK and UK (by ≈ 4–10 dB).
Semantic Kriging, Table 1

Comparison study of SemK prediction method with other popular kriging methods for five spatial zones in the city of Kolkata, India [(a) Actual LST imagery and the predicted imagery using (b)SemK, (c)OK, (d)UK]

| Zone | Bounding box [lower-left; upper-right] | (b) SemK prediction, PSNR | (c) OK prediction, PSNR | (d) UK prediction, PSNR |
| Zone 1 | [(88°24′33.00″ E, 22°56′38.72″ N); (88°28′20.72″ E, 22°59′29.30″ N)] | 33.30 dB | 27.08 dB | 24.68 dB |
| Zone 2 | [(88°14′55.00″ E, 22°44′32.97″ N); (88°18′42.19″ E, 22°47′24.16″ N)] | 36.13 dB | 28.33 dB | 26.28 dB |
| Zone 3 | [(88°23′52.94″ E, 22°41′38.58″ N); (88°27′40.21″ E, 22°44′28.86″ N)] | 34.49 dB | 28.20 dB | 23.36 dB |
| Zone 4 | [(88°20′53.51″ E, 22°29′43.04″ N); (88°24′40.77″ E, 22°32′33.75″ N)] | 33.89 dB | 29.28 dB | 28.38 dB |
| Zone 5 | [(88°24′28.58″ E, 22°24′5.23″ N); (88°28′15.37″ E, 22°26′55.87″ N)] | 36.07 dB | 31.05 dB | 28.94 dB |

[Image panels (a)–(d) for each zone: (a) actual LST data, (b) SemK prediction, (c) OK prediction, (d) UK prediction]

Some other key applications of this method are given as follows:
  • The basic building blocks of SemK can be customized with respect to any temporal, multivariate meteorological analysis, to achieve better accuracy.

  • The underlying semantic knowledge of the terrain is significant for different meteorological events (such as analysis of urban heat islands), urban planning, etc. The SemK can be utilized for modeling the temporal dynamism of these events.

  • Though SemK models the land cover information of the terrain as the semantics of the sample points, it follows a generic framework. The SemK procedure and its semantic metrics can be used for quantifying any new and influencing knowledge and for incorporating them into the spatial analysis to achieve better accuracy.

Future Directions

SemK incorporates the semantic knowledge of the terrain into the interpolation process and has a wide scope of application in the field of meteorology. The method can be extended in the following directions:
  • Extending the SemK method to time-series prediction and forecasting

  • Studying the inter-parameter spatio-semantic relationships between the existing parameters in the repository, in order to predict correlated implicit parameters

  • Modeling the land-atmosphere interaction to predict the future locations of some geospatial phenomena (e.g., urban heat islands)

References

  1. Bhattacharjee S, Dwivedi A, Prasad RR, Ghosh SK (2012) Ontology based spatial clustering framework for implicit knowledge discovery. In: Annual IEEE India conference (INDICON), Kochi, pp 561–566
  2. Bhattacharjee S, Ghosh SK (2014) Performance evaluation of semantic kriging: a Euclidean vector analysis approach. IEEE Geosci Remote Sens Lett 12(6):1185–1189
  3. Bhattacharjee S, Ghosh SK (2015) Time-series augmentation of semantic kriging for the prediction of meteorological parameters. In: IEEE international geoscience and remote sensing symposium (IGARSS), Milan, pp 4562–4565
  4. Bhattacharjee S, Mitra P, Ghosh SK (2014) Spatial interpolation to predict missing attributes in GIS using semantic kriging. IEEE Trans Geosci Remote Sens 52(8):4771–4780
  5. Bhattacharjee S, Prasad RR, Dwivedi A, Dasgupta A, Ghosh SK (2012) Ontology based framework for semantic resolution of geospatial query. In: 12th international conference on intelligent systems design and applications (ISDA), Kochi, pp 437–442
  6. Brus DJ, de Gruijter JJ, Marsman BA, Visschers BA, Bregt AK, Breeuwsma A (1996) The performance of spatial interpolation methods and choropleth maps to estimate properties at points: a soil survey case study. Environmetrics 7:1–16
  7. Franzen DW, Peck TR (1995) Field soil sampling density for variable rate fertilization. J Prod Agric 8(4):568–574
  8. Gruber TR (1995) Toward principles for the design of ontologies used for knowledge sharing? Int J Hum Comput Stud 43(5):907–928
  9. Karydas CG, Gitas IZ, Koutsogiannaki E, Lydakis-Simantiris N, Silleos G et al (2009) Evaluation of spatial interpolation techniques for mapping agricultural topsoil properties in Crete. EARSeL eProceedings 8(1):26–39
  10. Kravchenko A (2003) Influence of spatial structure on accuracy of interpolation methods. Soil Sci Soc Am 67(5):1564–1571
  11. Li J, Heap AD (2008) A review of spatial interpolation methods for environmental scientists. Geoscience, Canberra
  12. Li J, Heap AD (2011) A review of comparative studies of spatial interpolation methods in environmental sciences: performance and impact factors. Ecol Inf 6(3):228–241
  13. Mueller T, Pusuluri N, Mathias K, Cornelius P, Barnhisel R, Shearer S (2004) Map quality for ordinary kriging and inverse distance weighted interpolation. Soil Sci Soc Am 68(6):2042–2047
  14. Nalder IA, Wein RW (1998) Spatial interpolation of climatic normals: test of a new method in the Canadian boreal forest. Agri For Meteorol 92(4):211–225
  15. Ruddick R (2007) Data interpolation methods in the geoscience Australia seascape maps. Geoscience, Canberra
  16. Schloeder C, Zimmerman N, Jacobs M (2001) Comparison of methods for interpolating soil properties using limited data. Soil Sci Soc Am 65(2):470–479
  17. Tobler WR (1970) A computer movie simulating urban growth in the Detroit region. Econ Geogr 46:234–240
  18. Van Kuilenburg J, De Gruijter JJ, Marsman BA, Bouma J (1982) Accuracy of spatial interpolation between point data on soil moisture supply capacity, compared with estimates from mapping units. Geoderma 27:311–325
  19. Weisz R, Fleischer S, Smilowitz Z (1995) Map generation in high-value horticultural integrated pest management: appropriate interpolation methods for site-specific pest management of Colorado potato beetle (Coleoptera: Chrysomelidae). J Econ Entomol 88(6):1650–1657
  20. Yasrebi J, Saffari M, Fathi H, Karimian N, Moazallahi M, Gazni R et al (2009) Evaluation and comparison of ordinary kriging and inverse distance weighting methods for prediction of spatial variability of some soil chemical parameters. Res J Biol Sci 4(1):93–102

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, Kharagpur, India