Abstract
A spatial outlier is an object whose non-spatial attributes differ from those of the other objects in its spatial neighborhood. This paper develops and discusses a new geographically weighted method, based on the comedian approach, for detecting spatial outliers in multivariate data: the geographically weighted comedian (GWCOM) method. A simulation study assesses its performance and compares it with two existing geographically weighted methods, the geographically weighted Mahalanobis distance and geographically weighted principal component analysis. An application of the GWCOM method to a real water quality dataset is also discussed to demonstrate its effectiveness. The proposed method shows very promising results in both the simulation study and the real-data application, and it outperforms the existing methods.
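For readers unfamiliar with the comedian, the following minimal sketch may help. The comedian of two variables X and Y is the median of the products of their median-centered values, med((X - med X)(Y - med Y)), a robust analogue of the covariance. The code computes only the ordinary, unweighted comedian matrix; it does not implement the geographical weighting that defines GWCOM, whose details are not given in this abstract. The function name `comedian_matrix` and the toy data are illustrative assumptions.

```python
import numpy as np


def comedian_matrix(X):
    """Unweighted comedian matrix of an (n, p) data array.

    Entry (j, k) is med((X_j - med X_j) * (X_k - med X_k)).
    This is an illustrative sketch, not the GWCOM method itself.
    """
    X = np.asarray(X, dtype=float)
    # Center every column at its own median.
    centered = X - np.median(X, axis=0)
    p = X.shape[1]
    com = np.empty((p, p))
    for j in range(p):
        for k in range(p):
            # Median of the element-wise products of two centered columns.
            com[j, k] = np.median(centered[:, j] * centered[:, k])
    return com


# Hypothetical toy data: four regular rows and one gross outlier.
X = np.array([
    [1.0, 2.0],
    [2.0, 4.0],
    [3.0, 6.0],
    [4.0, 8.0],
    [100.0, -50.0],
])
com = comedian_matrix(X)
```

With an odd number of observations, each diagonal entry equals the square of the unscaled median absolute deviation of that variable, so a single extreme row such as the last one above has little influence on the estimate.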
Acknowledgements
The authors would like to thank the anonymous reviewers for their detailed comments and valuable suggestions which have substantially improved the paper.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Shukla, S., Lalitha, S. Geographically Weighted Comedian method for spatial outlier detection. Jpn J Stat Data Sci 6, 279–299 (2023). https://doi.org/10.1007/s42081-023-00202-5
DOI: https://doi.org/10.1007/s42081-023-00202-5