Semantic Kriging
Synonyms
Definition
Advances in remote sensing (RS) and geographic information systems (GIS) have introduced a significant number of research challenges. Proper staging of spatial data is necessary because geospatial repositories contain missing and erroneous information. Therefore, predicting meteorological parameters with better accuracy is an indispensable task for most applications related to weather and climatological analysis. Geostatistical interpolation methods are often considered the most appropriate for predicting meteorological parameters (such as land surface temperature (LST), normalized difference vegetation index (NDVI), moisture stress index (MSI), etc.), as they yield minimal error. The methods based on regression exhibit better performance because the autocorrelation within the region of interest (RoI) is modeled and incorporated in the regression process. The popular approaches for spatial interpolation include kriging, which is based on the notion of linear regression. Some well-known members of the kriging family are ordinary kriging (OK), simple kriging (SK), and universal kriging (UK).
This entry gives a detailed description of the semantic kriging (SemK) approach, covering the following functionalities:

Capturing the semantic knowledge of the land cover classes within the RoI and their formal representation with land cover ontology

Measuring semantic similarity and a priori correlation (spatial importance) between each pair of leaf land cover classes in the ontology

Modifying the ordinary kriging process, considering the spatial importance and the semantic similarity between the spatial land covers


Mathematical modeling of the modified weight matrix and other related parameters of SemK

Performance evaluation and comparison of SemK with other interpolation methods with real LST data
Historical Background
Several studies have compared spatial interpolation methods for different geospatial applications. Nalder and Wein (1998) compared three methods, namely ordinary kriging (OK), nearest neighbors (NN), and inverse distance squared (IDS), for estimating monthly temperature and precipitation in Western Canada, and observed that NN performs better than IDS, followed by OK. Van Kuilenburg et al. (1982) compared these methods for applications in agriculture and soil science and found OK preferable to the others. Similarly, Brus et al. (1996) compared their performance in estimating different soil properties and found that OK reports minimal error. Ruddick (2007) compared OK, NN, and inverse distance weighting (IDW) for the Australian seascape maps and found similar results for all three.
Nalder and Wein (1998) reported that the IDS method and its modifications are the most widely applied spatial interpolation methods, and Karydas et al. (2009) reported the same. Franzen and Peck (1995) and Weisz et al. (1995) reported kriging and IDS to be the two most widely used methods in GIS. In terms of accuracy, some studies report that kriging outperforms IDW (Kravchenko 2003), whereas others report that IDW produces better results than kriging (Nalder and Wein 1998). Kravchenko (2003) compared the effect of spatial autocorrelation on OK and IDW for grid soil sampling and reported that kriging outperforms IDW. Other studies report further comparisons: Schloeder et al. (2001) found OK and IDW to perform with the same accuracy, Yasrebi et al. (2009) found OK to perform better than IDW in determining the spatial variability of soil chemical parameters, and Mueller et al. (2004) observed that IDW performed as well as or better than kriging.
Li and Heap (2008) proposed a frequency graph representing how often the popular interpolation methods were compared and recommended in 51 reviewed comparative studies. Li and Heap (2011) extended this analysis with more reported results. Similarly, we have extended this study with a few more recent works and obtained a new popularity graph, shown in Fig. 1. Among the fourteen candidate interpolation methods, ordinary kriging (OK) is the most popular technique, followed by inverse distance weighting (IDW), inverse distance squared (IDS), nearest neighbors (NN), and universal kriging (UK), in that order. A review of the state of the art shows that none of the existing interpolation methods incorporates land cover knowledge of the terrain into the prediction of climatological parameters. The semantic kriging approach, first proposed by Bhattacharjee et al. (2014), quantifies this land cover-based semantic knowledge of the terrain and enhances prediction accuracy by incorporating it into the interpolation process.
Scientific Fundamentals
Ordinary Kriging (OK)
Ontology and Tree Representation

Semantic relations (e.g., hypernym, hyponym, meronym) can be used for building the ontology hierarchy.

The inheritance hierarchies can be formed from the ontology itself.

Hierarchical ontologies can be nested.

Reasoning over the ontology hierarchy yields the degree of association between concepts.
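The hierarchy properties listed above can be illustrated with a small sketch. It assumes a hypothetical land cover ontology (the class names are invented for illustration) stored as child-to-parent links, and it uses the Wu-Palmer path-based similarity as a stand-in measure of association between concepts. This is not necessarily the similarity measure used by SemK; it only shows how reasoning over a tree can produce a numeric association score.

```python
# Hypothetical land cover ontology, stored as child -> parent links.
# The class names are illustrative assumptions, not taken from the entry.
PARENT = {
    "land_cover": None,
    "vegetation": "land_cover",
    "built_up": "land_cover",
    "water": "land_cover",
    "forest": "vegetation",
    "cropland": "vegetation",
    "residential": "built_up",
    "industrial": "built_up",
}


def path_to_root(concept):
    """Return the list of concepts from `concept` up to the root."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def leaves():
    """Leaf concepts are those that never appear as a parent."""
    parents = set(PARENT.values())
    return sorted(c for c in PARENT if c not in parents)


def depth(concept):
    """Depth in the tree, counting the root as depth 1."""
    return len(path_to_root(concept))


def lowest_common_ancestor(a, b):
    """First ancestor of `b` (walking upward) that is also an ancestor of `a`."""
    ancestors_a = set(path_to_root(a))
    for concept in path_to_root(b):
        if concept in ancestors_a:
            return concept
    raise ValueError("concepts do not share a root")


def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(lca) / (depth(a) + depth(b))."""
    lca = lowest_common_ancestor(a, b)
    return 2.0 * depth(lca) / (depth(a) + depth(b))


print(leaves())
print(wu_palmer("forest", "cropland"))     # siblings under "vegetation"
print(wu_palmer("forest", "residential"))  # related only through the root
```

In this sketch, two leaf classes that share a close ancestor score higher than two classes that meet only at the root, which is the kind of ordering a semantic similarity measure is expected to provide.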
Semantic Kriging (SemK)
The ontology is domain and region specific, i.e., it depends on the nature of the RoI and the prediction parameter. For different application domains, the structure of the ontology hierarchy varies with the type of concept it represents, the number of participating concepts, the relations, etc. The ontology is also adaptive. In the case of spatial interpolation, each sample point must correspond to one of the leaf land covers (the land cover represented by a leaf concept) in the ontology. The sample points are mapped to the most appropriate representative leaf land cover class in the hierarchy, and the amount of association between the spatial land cover classes in the ontology is then evaluated. This association can be measured in two ways: by evaluating the spatial importance between a pair of leaf land covers and by analyzing the semantic similarity between them. These processes are termed spatial importance measurement and semantic similarity measurement, respectively. The two parameters (semantic similarity and spatial importance) modify the traditional covariance measure of OK and map it into a higher dimension. In ordinary kriging, the assigned weights are a function of Euclidean distance only. The weights assigned in SemK are a function of distance as well as of the semantic property of the terrain. As the covariance is modified, the weight that ordinary kriging assigns to each interpolating point is modified as well. The weights are then normalized to predict the parameter value. As SemK has more decision parameters than OK, the former is a more informative prediction process than the latter (Bhattacharjee and Ghosh 2014).
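The weighting idea described above can be sketched as follows. This is a minimal illustration and not the authors' formulation: the exponential covariance model, the multiplicative `semantic_score`, the class names, and all function names are assumptions. The sketch computes covariance-based weights from distance alone, then recomputes them after scaling each covariance by a semantic score, and normalizes the weights in both cases.

```python
import math


def exp_covariance(h, sill=1.0, corr_range=10.0):
    """Assumed exponential covariance model: C(h) = sill * exp(-h / range)."""
    return sill * math.exp(-h / corr_range)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular covariance matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def semantic_score(class_a, class_b):
    """Hypothetical multiplier: 1.0 for identical land covers, 0.5 otherwise."""
    return 1.0 if class_a == class_b else 0.5


def normalized_weights(points, classes, target, target_class, use_semantics):
    """Weights (summing to one) of the interpolating points for the target."""
    def cov(p, q, class_p, class_q):
        c = exp_covariance(math.dist(p, q))
        return c * semantic_score(class_p, class_q) if use_semantics else c

    n = len(points)
    c_mat = [[cov(points[i], points[j], classes[i], classes[j])
              for j in range(n)] for i in range(n)]
    d_vec = [cov(points[i], target, classes[i], target_class) for i in range(n)]
    w = solve(c_mat, d_vec)
    total = sum(w)
    if abs(total) < 1e-12:
        raise ValueError("weights cannot be normalized")
    return [wi / total for wi in w]


def predict(values, weights):
    """Weighted sum of the known parameter values."""
    return sum(w * z for w, z in zip(weights, values))


# Two sample points at equal distance from the target, on different land covers.
points = [(1.0, 0.0), (-1.0, 0.0)]
classes = ["forest", "built_up"]
values = [30.0, 36.0]  # illustrative parameter values, not real data
target, target_class = (0.0, 0.0), "forest"

plain = normalized_weights(points, classes, target, target_class, False)
semantic = normalized_weights(points, classes, target, target_class, True)
print(plain, predict(values, plain))
print(semantic, predict(values, semantic))
```

With distance alone, the two equidistant points receive equal weight. Once the assumed semantic score is applied, the point that shares the land cover of the target receives the larger weight, so the prediction moves toward its value.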
Spatial Importance Measurement
The spatial importance between each pair of leaf land cover classes in the ontology is measured by a correlation analysis between them with respect to the prediction parameter. For this purpose, the whole RoI is subdivided into k non-overlapping zones, such that \(\bigcup\nolimits_{i=1}^{k}R_{i} = RoI\). To carry out a pairwise correlation analysis, k pairs of sample points are chosen from the k zones. In each pair, the first element represents the first land cover type, and the second element represents the second type. These points are chosen by obeying the law of geographic proximity, i.e., to be influenced by each other, the two points of a pair must lie within a predefined distance d. For our study, d is chosen as 5 km. The value of this correlation metric lies in the range [−1, 1] and is further normalized to a positive range (e.g., [1, 3]) to avoid negative mapping of the covariances. This correlation analysis has the following properties:

The correlation analysis depends on the primary parameter to be predicted, i.e., the correlation value between two land cover classes in a particular RoI is not the same for all parameters.

It is an a priori correlation, i.e., the correlation between a pair of land cover classes is determined without considering the impact of other nearby land covers.

It is a global correlation analysis, i.e., the correlation score between a pair is constant for the whole study region.
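The correlation-and-normalization step described above can be sketched as follows. The sketch assumes that the Pearson coefficient is the correlation measure and that a linear map is used to move [−1, 1] onto [1, 3]; the entry only states that the value is normalized to a positive range, so both choices and all function names are assumptions. The sample values are invented, and the selection of pairs within the distance d is not shown.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def rescale(r, low=1.0, high=3.0):
    """Assumed linear map of a correlation in [-1, 1] onto [low, high]."""
    return low + (r + 1.0) * (high - low) / 2.0


def spatial_importance(pairs, low=1.0, high=3.0):
    """Spatial importance of two land covers from k paired samples.

    Each pair holds the parameter value observed on the first land cover
    and the value observed on the second land cover in the same zone.
    """
    first = [a for a, _ in pairs]
    second = [b for _, b in pairs]
    return rescale(pearson(first, second), low, high)


# Invented parameter values for k = 4 zones (first cover, second cover).
pairs = [(30.1, 33.0), (31.4, 34.2), (29.8, 32.1), (32.0, 35.5)]
print(spatial_importance(pairs))
```

Because the score depends on the sampled parameter values, running the same analysis for a different prediction parameter would generally give a different score for the same pair of land covers.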
Semantic Similarity Measurement
These four matrices (\(W_{2\,[N\times N]}\), \(W_{3\,[N\times N]}\), \(SI_{[N\times 1]}\), and \(SD_{[N\times 1]}\)) modify the covariance matrix (C) and the distance matrix (D) of OK.
Theoretical Error Analysis of Semantic Kriging
This section presents a brief theoretical analysis and formalization of the error reported by SemK, together with the other related parameters and constraints. For both OK and SemK, let the actual value at the prediction point \(x_{0}\) be \(Z(x_{0})\). The prediction is carried out with respect to N known interpolating points, \(\{Z(x_{1}), \cdots, Z(x_{N})\}\), where \(Z(x_{i})\) is the actual parameter value at the point \(x_{i}\) and \(\hat{Z}(x_{0})\) is the predicted parameter value at the prediction point \(x_{0}\). For ordinary kriging, the two traditional matrices, namely, the covariance matrix (C) and the distance matrix (D), are defined as follows:
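A sketch of these two quantities in the usual covariance-based notation is given here. It assumes that D collects the covariances between each interpolating point and the prediction point, and that the weights are normalized after being computed, as described for SemK earlier in this entry. The symbols \(C_{ij}\), \(C_{i0}\), \(W_{i}\), and \(w_{i}\) are introduced for illustration, and the exact notation of the original formulation may differ.

\[
C = \begin{bmatrix} C_{11} & \cdots & C_{1N} \\ \vdots & \ddots & \vdots \\ C_{N1} & \cdots & C_{NN} \end{bmatrix}_{[N \times N]}, \qquad
D = \begin{bmatrix} C_{10} \\ \vdots \\ C_{N0} \end{bmatrix}_{[N \times 1]}, \qquad
C_{ij} = \operatorname{Cov}\big(Z(x_{i}), Z(x_{j})\big)
\]

\[
W = C^{-1} D, \qquad
w_{i} = \frac{W_{i}}{\sum_{j=1}^{N} W_{j}}, \qquad
\hat{Z}(x_{0}) = \sum_{i=1}^{N} w_{i}\, Z(x_{i})
\]

Under these assumptions, the prediction error at \(x_{0}\) is the difference \(Z(x_{0}) - \hat{Z}(x_{0})\).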
Key Applications
Spatial interpolation is the most widely used prediction technique in the fields of remote sensing and geographic information systems. As SemK deals with the semantic knowledge of the terrain, it is highly significant in geospatial applications where the prediction process can be improved by incorporating additional semantic knowledge. For example, knowledge of the land cover distribution plays an important role in predicting the land surface temperature (LST) of a region.
Comparison of the SemK prediction method with other popular kriging methods for five spatial zones in the city of Kolkata, India. Panels: (a) actual LST imagery; imagery predicted using (b) SemK, (c) OK, and (d) UK. PSNR is reported for each predicted image.
Zone  Center coordinate  (b) SemK PSNR (dB)  (c) OK PSNR (dB)  (d) UK PSNR (dB)
Zone 1  [(88°24′33.00″E 22°56′38.72″N); (88°28′20.72″E 22°59′29.30″N)]  33.30  27.08  24.68
Zone 2  [(88°14′55.00″E 22°44′32.97″N); (88°18′42.19″E 22°47′24.16″N)]  36.13  28.33  26.28
Zone 3  [(88°23′52.94″E 22°41′38.58″N); (88°27′40.21″E 22°44′28.86″N)]  34.49  28.20  23.36
Zone 4  [(88°20′53.51″E 22°29′43.04″N); (88°24′40.77″E 22°32′33.75″N)]  33.89  29.28  28.38
Zone 5  [(88°24′28.58″E 22°24′5.23″N); (88°28′15.37″E 22°26′55.87″N)]  36.07  31.05  28.94
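The peak signal-to-noise ratio (PSNR) used in this comparison can be computed with the standard definition, PSNR = 10 log10(peak² / MSE). The following is a minimal sketch; the function name, the flat-list representation of the imagery, and the default peak value of 255 (8-bit imagery) are assumptions, and the pixel values in the example are invented.

```python
import math


def psnr(actual, predicted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are given as flat sequences of pixel values; `peak` is the
    maximum possible pixel value (255 is assumed here for 8-bit imagery).
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Invented pixel values; a 2-D raster would be flattened row by row first.
actual = [100.0, 100.0, 120.0, 130.0]
predicted = [110.0, 90.0, 118.0, 133.0]
print(psnr(actual, predicted))
```

A higher PSNR means a lower mean squared error between the predicted and the actual imagery, which is why the method with the largest value in each row is the most accurate one for that zone.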

The basic building blocks of SemK can be customized with respect to any temporal, multivariate meteorological analysis, to achieve better accuracy.

The underlying semantic knowledge of the terrain is significant for different meteorological events (such as the analysis of urban heat islands), urban planning, etc. SemK can be used to model the temporal dynamism of these events.

Though SemK models the land cover information of the terrain as the semantics of the sample points, it follows a generic framework. The SemK procedure and its semantic metrics can be used to quantify any new, influential knowledge and to incorporate it into the spatial analysis to achieve better accuracy.
Future Directions

Extending the SemK method for time-series prediction and forecasting

Studying the inter-parameter spatio-semantic relationships between the existing parameters in the repository, for predicting the correlated implicit parameters

Modeling the land-atmospheric interaction for predicting the future locations of some geospatial phenomena (e.g., urban heat islands)
Cross-References
Footnotes
References
 Bhattacharjee S, Dwivedi A, Prasad RR, Ghosh SK (2012) Ontology based spatial clustering framework for implicit knowledge discovery. In: Annual IEEE India conference (INDICON), Kochi, pp 561–566
 Bhattacharjee S, Ghosh SK (2014) Performance evaluation of semantic kriging: a Euclidean vector analysis approach. IEEE Geosci Remote Sens Lett 12(6):1185–1189
 Bhattacharjee S, Ghosh SK (2015) Time-series augmentation of semantic kriging for the prediction of meteorological parameters. In: IEEE international geoscience and remote sensing symposium (IGARSS), Milan, pp 4562–4565
 Bhattacharjee S, Mitra P, Ghosh SK (2014) Spatial interpolation to predict missing attributes in GIS using semantic kriging. IEEE Trans Geosci Remote Sens 52(8):4771–4780
 Bhattacharjee S, Prasad RR, Dwivedi A, Dasgupta A, Ghosh SK (2012) Ontology based framework for semantic resolution of geospatial query. In: 12th international conference on intelligent systems design and applications (ISDA), Kochi, pp 437–442
 Brus DJ, de Gruijter JJ, Marsman BA, Visschers BA, Bregt AK, Breeuwsma A (1996) The performance of spatial interpolation methods and choropleth maps to estimate properties at points: a soil survey case study. Environmetrics 7:1–16
 Franzen DW, Peck TR (1995) Field soil sampling density for variable rate fertilization. J Prod Agric 8(4):568–574
 Gruber TR (1995) Toward principles for the design of ontologies used for knowledge sharing? Int J Hum Comput Stud 43(5):907–928
 Karydas CG, Gitas IZ, Koutsogiannaki E, Lydakis-Simantiris N, Silleos G et al (2009) Evaluation of spatial interpolation techniques for mapping agricultural topsoil properties in Crete. EARSeL eProceedings 8(1):26–39
 Kravchenko A (2003) Influence of spatial structure on accuracy of interpolation methods. Soil Sci Soc Am 67(5):1564–1571
 Li J, Heap AD (2008) A review of spatial interpolation methods for environmental scientists. Geoscience, Canberra
 Li J, Heap AD (2011) A review of comparative studies of spatial interpolation methods in environmental sciences: performance and impact factors. Ecol Inf 6(3):228–241
 Mueller T, Pusuluri N, Mathias K, Cornelius P, Barnhisel R, Shearer S (2004) Map quality for ordinary kriging and inverse distance weighted interpolation. Soil Sci Soc Am 68(6):2042–2047
 Nalder IA, Wein RW (1998) Spatial interpolation of climatic normals: test of a new method in the Canadian boreal forest. Agri For Meteorol 92(4):211–225
 Ruddick R (2007) Data interpolation methods in the geoscience Australia seascape maps. Geoscience, Canberra
 Schloeder C, Zimmerman N, Jacobs M (2001) Comparison of methods for interpolating soil properties using limited data. Soil Sci Soc Am 65(2):470–479
 Tobler WR (1970) A computer movie simulating urban growth in the Detroit region. Econ Geogr 46:234–240
 Van Kuilenburg J, De Gruijter JJ, Marsman BA, Bouma J (1982) Accuracy of spatial interpolation between point data on soil moisture supply capacity, compared with estimates from mapping units. Geoderma 27:311–325
 Weisz R, Fleischer S, Smilowitz Z (1995) Map generation in high-value horticultural integrated pest management: appropriate interpolation methods for site-specific pest management of Colorado potato beetle (Coleoptera: Chrysomelidae). J Econ Entomol 88(6):1650–1657
 Yasrebi J, Saffari M, Fathi H, Karimian N, Moazallahi M, Gazni R et al (2009) Evaluation and comparison of ordinary kriging and inverse distance weighting methods for prediction of spatial variability of some soil chemical parameters. Res J Biol Sci 4(1):93–102