A neighborhood weighted-based method for the detection of outliers

Applied Intelligence

Abstract

Outlier detection is an important research direction in data mining, with applications including fraud detection, activity monitoring, medical research, and network intrusion detection. Many outlier detection methods have been proposed; however, most of them are not suitable for complex patterns because they do not extract appropriate neighbor information and cannot estimate the density accurately. Additionally, their performance is not stable and depends heavily on the number of nearest neighbors (k) selected. To overcome these defects, we propose a neighborhood weighted-based outlier detection (NWOD) algorithm that can obtain correct detection results in a variety of situations. In our algorithm, the local density of an object is measured by constructing a weighted nearest neighbor graph and quantifying how difficult it is for the object and its nearest neighbors to reach each other. Furthermore, the proposed neighborhood weighted local outlier factor (NWLOF) compares the neighborhood weighted local density of a given object with that of the objects in its neighborhood, which determines the degree to which the object is an outlier. The larger the NWLOF of an object is, the more likely it is to be an outlier. In addition, because the proposed algorithm is based on the concept of a natural stable structure, its performance does not rely on the value of k. Experiments conducted on both synthetic and real-world datasets show the superiority of our algorithm.
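
The abstract does not give the NWLOF formula, so the sketch below only illustrates the general idea it describes: estimate a local density for each object, then compare it with the densities of the object's neighbors. It is a simplified, LOF-style stand-in and not the NWOD algorithm itself. The function names, the inverse-mean-distance density, and the fixed k are all assumptions made for illustration; in particular, NWOD is stated to avoid a user-chosen k, which this sketch does not attempt.

```python
import math


def knn(points, i, k):
    """Return (index, distance) pairs for the k nearest neighbours of points[i]."""
    dists = sorted(
        (math.dist(points[i], points[j]), j)
        for j in range(len(points))
        if j != i
    )
    return [(j, d) for d, j in dists[:k]]


def local_density(points, i, k):
    """Assumed density estimate: inverse of the mean distance to the k neighbours."""
    neigh = knn(points, i, k)
    mean_dist = sum(d for _, d in neigh) / len(neigh)
    return 1.0 / (mean_dist + 1e-12)  # small constant guards against division by zero


def outlier_scores(points, k=3):
    """Ratio of the neighbours' mean density to the object's own density.

    A score well above 1 means the object lies in a sparser region than
    its neighbours, so it is more likely to be an outlier.
    """
    dens = [local_density(points, i, k) for i in range(len(points))]
    scores = []
    for i in range(len(points)):
        neigh = knn(points, i, k)
        mean_neigh_density = sum(dens[j] for j, _ in neigh) / len(neigh)
        scores.append(mean_neigh_density / dens[i])
    return scores


# Toy data: a tight cluster around the unit square plus one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5), (8.0, 8.0)]
scores = outlier_scores(points, k=3)
most_outlying = max(range(len(points)), key=scores.__getitem__)
print(most_outlying, [round(s, 2) for s in scores])
```

As with the NWLOF described above, a larger score indicates a more likely outlier; here the distant point receives by far the largest score, while the clustered points score close to 1.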



Acknowledgments

The authors would like to thank the editor and anonymous reviewers for their valuable comments and suggestions. This work is funded by the National Natural Science Foundation of China (no. 61701051), the Fundamental Research Funds for the Central Universities (no. 2019CDCGJSJ329) and the Graduate Scientific Research and Innovation Foundation of Chongqing, China (Grant no. CYS20067).

Author information

Corresponding author

Correspondence to Zhong-Yang Xiong.

Ethics declarations

Conflict of Interest Statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xiong, ZY., Long, H., Zhang, YF. et al. A neighborhood weighted-based method for the detection of outliers. Appl Intell 53, 9897–9915 (2023). https://doi.org/10.1007/s10489-022-03258-0

  • DOI: https://doi.org/10.1007/s10489-022-03258-0
