A pattern-based outlier region detection method for two-dimensional arrays
With the prevalence of sensing devices and numerical simulation software, large amounts of data are now generated in the form of two-dimensional (2D) arrays. An important task in analyzing such arrays is finding anomalous, or outlier, regions. In this article, we propose an effective method for detecting outlier regions in an arbitrary 2D array, that is, regions whose pattern differs significantly from that of their surrounding regions. Unlike most existing methods, which determine the outlierness of a region by how much its average differs from that of its neighboring elements, our method uses regression models of the region to determine its outlierness. More specifically, the method first divides the array into a number of small subarrays and builds a regression model for each subarray. It then iteratively merges adjacent subarrays with similar regression models into larger clusters. After the clustering, the method reports very small clusters as outlier regions. Finally, experiments on synthetic and real datasets demonstrate the effectiveness of the proposed method.
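The steps described in the abstract (divide, fit, merge, report) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify the regression form, the similarity measure, or any thresholds, so the following are all assumptions made for illustration: a planar model `value ~ a + b*row + c*col` fitted in block-local coordinates, Euclidean distance between coefficient vectors as the similarity test, and the hypothetical parameters `block`, `tol`, and `max_size`. The merge is also simplified to a single union-find pass over neighboring blocks rather than an iterative merge that refits models.

```python
import numpy as np

def fit_plane(block):
    """Least-squares fit of value ~ a + b*row + c*col over one block (assumed model)."""
    rows, cols = np.indices(block.shape)
    X = np.column_stack([np.ones(block.size), rows.ravel(), cols.ravel()])
    coef, *_ = np.linalg.lstsq(X, block.ravel(), rcond=None)
    return coef

def find(parent, i):
    """Union-find root lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

def outlier_blocks(arr, block=4, tol=0.5, max_size=1):
    """Return (block_row, block_col) indices of blocks in clusters of size <= max_size."""
    nr, nc = arr.shape[0] // block, arr.shape[1] // block
    # Step 1: divide the array into subarrays and fit one model per subarray.
    coefs = np.array([[fit_plane(arr[r * block:(r + 1) * block,
                                     c * block:(c + 1) * block])
                       for c in range(nc)] for r in range(nr)])
    # Step 2: merge right/down neighbors whose coefficients are close (assumed test).
    parent = list(range(nr * nc))
    for r in range(nr):
        for c in range(nc):
            for dr, dc in ((0, 1), (1, 0)):
                r2, c2 = r + dr, c + dc
                if r2 < nr and c2 < nc and \
                        np.linalg.norm(coefs[r, c] - coefs[r2, c2]) <= tol:
                    parent[find(parent, r2 * nc + c2)] = find(parent, r * nc + c)
    # Step 3: report blocks that belong to very small clusters.
    roots = [find(parent, i) for i in range(nr * nc)]
    sizes = {}
    for root in roots:
        sizes[root] = sizes.get(root, 0) + 1
    return [(i // nc, i % nc) for i, root in enumerate(roots)
            if sizes[root] <= max_size]

# Demo: a flat background with one block that has the same mean but a steep row gradient.
demo = np.ones((16, 16))
demo[4:8, 8:12] += 2.0 * (np.arange(4)[:, None] - 1.5)
print(outlier_blocks(demo))  # [(1, 2)]
```

In the demo, the flagged block has the same average as its neighbors, so a purely average-based comparison would not separate it. Only its fitted coefficients differ, which is the distinction the abstract draws between pattern-based and average-based detection.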
Keywords: Outlier detection, Outlier region, Two-dimensional array
We would like to thank the anonymous reviewers for their insightful comments, which improved the quality of this article. We also thank Sang-Un Gu for locating and preparing the real datasets.