Abstract
Traditional feature-based classification methods require objects to have an explicit, independent, and identifiable set of features, yet most geo-referenced objects lack such features. Developing classificatory features in a geospatial context is therefore a prerequisite for effective spatial classification. Because of spatial dependency, objects are correlated with one another, and the features of an object of interest (e.g., the distribution of neighboring objects) are spread over a wide range of neighboring areas. However, because the appropriate neighborhood size is uncertain, the dimensionality of the potential feature set is particularly high in spatial classification. We therefore propose a new model that automatically selects a subset of spatially explicit features through continuous decision making by multiple agents in reinforcement learning (RL). A novel reward mechanism feeds knowledge of the downstream classification task back into the feature selection loop. Through extensive experiments on facility points-of-interest datasets, we demonstrate that the subset of classificatory features selected by our RL model can significantly improve the accuracy of spatial classification. Moreover, our feature selection offers potential explainability of the spatial classification rules, as it can identify the neighboring areas that affect the classification result.
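To make the idea of multi-agent feature selection with a classifier-driven reward concrete, the following is a minimal sketch and not the model proposed in the paper. It assumes one agent per candidate feature (imagined here as one neighborhood ring each), a stateless, bandit-style simplification of Q-learning, a nearest-centroid classifier as the downstream task, and purely synthetic data. All names (`make_data`, `accuracy`, `select_features`, `N_FEATURES`, `INFORMATIVE`) and all parameter values are hypothetical.

```python
import random

N_FEATURES = 6          # hypothetical: one candidate feature per neighborhood ring
INFORMATIVE = (0, 1)    # assumption: only the two nearest rings carry class signal


def make_data(n, rng):
    """Synthetic stand-in for neighborhood features of geo-referenced objects."""
    rows, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = [rng.gauss(0.0, 1.0) for _ in range(N_FEATURES)]
        for j in INFORMATIVE:
            x[j] += 2.0 * y  # shift the informative features by class
        rows.append(x)
        labels.append(y)
    return rows, labels


def accuracy(mask, train, valid):
    """Validation accuracy of a nearest-centroid classifier on the kept features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    (xtr, ytr), (xva, yva) = train, valid
    centroids = {}
    for c in (0, 1):
        members = [x for x, y in zip(xtr, ytr) if y == c]
        size = max(len(members), 1)  # guard against an empty class
        centroids[c] = [sum(x[j] for x in members) / size for j in cols]
    hits = 0
    for x, y in zip(xva, yva):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c]))
                for c in (0, 1)}
        hits += int(min(dist, key=dist.get) == y)
    return hits / len(yva)


def select_features(episodes=300, alpha=0.1, epsilon=0.2, penalty=0.01, seed=0):
    """Each agent learns whether to keep (1) or drop (0) its own feature."""
    rng = random.Random(seed)
    train, valid = make_data(200, rng), make_data(200, rng)
    q = [[0.0, 0.0] for _ in range(N_FEATURES)]  # q[agent][action]
    for _ in range(episodes):
        # Epsilon-greedy action choice, made independently by every agent.
        mask = []
        for agent in range(N_FEATURES):
            if rng.random() < epsilon:
                mask.append(rng.randint(0, 1))
            else:
                mask.append(int(q[agent][1] >= q[agent][0]))
        # Shared reward: downstream accuracy minus a small cost per kept feature.
        reward = accuracy(mask, train, valid) - penalty * sum(mask)
        for agent, action in enumerate(mask):
            q[agent][action] += alpha * (reward - q[agent][action])
    best = [int(q[a][1] > q[a][0]) for a in range(N_FEATURES)]
    return best, accuracy(best, train, valid)


if __name__ == "__main__":
    mask, acc = select_features()
    print("selected feature mask:", mask, "validation accuracy:", round(acc, 3))
```

The reward is shared by all agents and computed from the classifier's validation accuracy, which is one simple way to feed the downstream task back into the selection loop. The per-feature penalty is an added assumption that discourages keeping uninformative features.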
Data availability
The data and code that support the findings of this study are available on figshare at https://figshare.com/s/374af036b0e9630c9fa1.
Funding
The project was supported by the National Natural Science Foundation of China (42371446 and 42071442) and by the Fundamental Research Funds for the Central Universities, China University of Geosciences (Wuhan) (No. CUG170640). This research was also supported by Meituan.
Author information
Contributions
Cheng Wei: Methodology, Software, Writing - original draft, Writing - review and editing, Visualization, Data analysis and interpretation, Validation, Investigation. Wenhao Yu: Conceptualization, Software, Writing - original draft, Writing - review and editing, Supervision.
Ethics declarations
Competing interests
The authors declare no competing interests.
About this article
Cite this article
Wei, C., Yu, W. A spatial dependency based reinforcement learning model for selecting features in spatial classification. Geoinformatica (2024). https://doi.org/10.1007/s10707-024-00523-x
DOI: https://doi.org/10.1007/s10707-024-00523-x