Geographically Weighted Comedian method for spatial outlier detection

  • Original Paper
  • Japanese Journal of Statistics and Data Science

Abstract

A spatial outlier is an object whose non-spatial attributes differ from those of the other objects in its spatial neighborhood. This paper develops and discusses a new geographically weighted method, based on the Comedian approach, for detecting spatial outliers in multivariate data: the geographically weighted comedian (GWCOM) method. A simulation study assesses its performance and compares it with two existing geographically weighted methods, the geographically weighted Mahalanobis distance and geographically weighted principal component analysis. A real-data application of the GWCOM method to a water quality dataset demonstrates its effectiveness. The proposed GWCOM method shows very promising results in both the simulation study and the real-data application, and it outperforms the existing methods.
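
As background for the Comedian approach named in the abstract, the sketch below computes the plain (unweighted) comedian, COM(X, Y) = med((X - med X)(Y - med Y)), and the matrix of pairwise comedians, in Python with NumPy. It is a minimal illustration under assumptions: the function names `comedian` and `comedian_matrix` are hypothetical, and the geographic weighting that defines GWCOM is not reproduced here because the abstract does not specify it.

```python
import numpy as np


def comedian(x, y):
    """Plain comedian: median of the products of median-centered values.

    Hypothetical helper for illustration; not the authors' GWCOM code.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.median((x - np.median(x)) * (y - np.median(y))))


def comedian_matrix(data):
    """Symmetric p x p matrix of pairwise comedians for an n x p array."""
    data = np.asarray(data, dtype=float)
    # Center every column at its own median.
    centered = data - np.median(data, axis=0)
    p = data.shape[1]
    com = np.empty((p, p))
    for j in range(p):
        for k in range(j, p):
            # Median of elementwise products of two centered columns.
            com[j, k] = com[k, j] = np.median(centered[:, j] * centered[:, k])
    return com


# Toy data: the value 100 is an obvious outlier in the first variable.
x = [1, 2, 3, 4, 100]
y = [2, 4, 6, 8, 10]
print(comedian(x, y))
print(comedian_matrix(np.column_stack([x, y])))
```

With an odd number of observations, `comedian(x, x)` equals the squared median absolute deviation of `x`, which is why the comedian can serve as a robust counterpart of the covariance: the single extreme value in `x` above has little influence on the result.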

Acknowledgements

The authors would like to thank the anonymous reviewers for their detailed comments and valuable suggestions which have substantially improved the paper.

Author information

Corresponding author

Correspondence to Sweta Shukla.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Shukla, S., Lalitha, S. Geographically Weighted Comedian method for spatial outlier detection. Jpn J Stat Data Sci 6, 279–299 (2023). https://doi.org/10.1007/s42081-023-00202-5

  • DOI: https://doi.org/10.1007/s42081-023-00202-5
