Abstract
Outlier detection is an important task in data mining. In complex multidimensional datasets, however, it is difficult both to find the cluster centers and to measure how far each potential outlier deviates. This work proposes an outlier detection method based on multi-dimensional clustering and local density (ODBMCLD). ODBMCLD first identifies center objects from the local density peaks of the data objects and clusters the whole dataset around those centers. Next, the outlying objects in the different clusters are marked as candidate anomalies. Finally, the N candidates with the highest outlier factors are selected as the final anomalies. Experiments verify the feasibility and effectiveness of the method.
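The three stages named in the abstract (density-peak centers, clustering around them, top-N ranking of candidates) can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything concrete here is an assumption made for illustration and is not the authors' definition: the cutoff-count density `rho`, the center score `rho * delta`, the candidate rule (density below the cluster mean), the outlier factor (distance to the cluster center divided by `rho + 1`), and the names `odbmcld_sketch`, `d_c`, `n_clusters` and `top_n`.

```python
import numpy as np

def odbmcld_sketch(X, n_clusters, d_c, top_n):
    """Illustrative density-peak clustering plus top-N outlier ranking.

    All scoring rules below are assumptions, not the paper's formulas.
    """
    X = np.asarray(X, dtype=float)
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

    # Assumed local density: number of other points closer than d_c.
    rho = (dist < d_c).sum(axis=1) - 1

    # delta: distance to the nearest point that ranks higher in density.
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)
    parent = np.full(n, -1)
    delta[order[0]] = dist[order[0]].max()
    for rank in range(1, n):
        i = order[rank]
        higher = order[:rank]
        j = higher[np.argmin(dist[i, higher])]
        delta[i], parent[i] = dist[i, j], j

    # Assumed center score; the densest point is always kept as a center.
    gamma = rho * delta
    gamma[order[0]] = np.inf
    centers = np.argsort(-gamma, kind="stable")[:n_clusters]

    # Each remaining point inherits the label of its denser neighbor.
    labels = np.full(n, -1)
    labels[centers] = np.arange(len(centers))
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]

    # Assumed outlier factor: far from the center and sparsely surrounded.
    factor = dist[np.arange(n), centers[labels]] / (rho + 1.0)

    # Assumed candidate rule: density below the mean of the point's cluster.
    candidate = np.zeros(n, dtype=bool)
    for c in range(len(centers)):
        members = labels == c
        candidate |= members & (rho < rho[members].mean())

    cand_idx = np.flatnonzero(candidate)
    ranked = cand_idx[np.argsort(-factor[cand_idx], kind="stable")]
    return ranked[:top_n], labels

# Toy data: two dense 5x5 grids plus two isolated points (indices 50 and 51).
g = np.arange(5) * 0.1
grid = np.array([(x, y) for x in g for y in g])
X = np.vstack([grid, grid + 5.0, [[2.5, 0.0], [5.0, 8.0]]])
outliers, labels = odbmcld_sketch(X, n_clusters=2, d_c=0.25, top_n=2)
print(outliers)
```

On this toy input the two isolated points receive the largest factors, because they are both far from their cluster center and have no neighbors within `d_c`. The sketch builds a full pairwise distance matrix, so it is only suitable for small datasets.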
Additional information
Foundation item: Project(61362021) supported by the National Natural Science Foundation of China; Project(2016GXNSFAA380149) supported by Natural Science Foundation of Guangxi Province, China; Projects(2016YJCXB02, 2017YJCX34) supported by Innovation Project of GUET Graduate Education, China; Project(2011KF11) supported by the Key Laboratory of Cognitive Radio and Information Processing, Ministry of Education, China
About this article
Cite this article
Shou, Zy., Li, My. & Li, Sm. Outlier detection based on multi-dimensional clustering and local density. J. Cent. South Univ. 24, 1299–1306 (2017). https://doi.org/10.1007/s11771-017-3535-4
DOI: https://doi.org/10.1007/s11771-017-3535-4