
A Survey on Outlier Detection in the Context of Stream Mining: Review of Existing Approaches and Recommendations

  • Imen Souiden
  • Zaki Brahmi
  • Hajer Toumi
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 557)

Abstract

Extracting only expected knowledge from data is generally not sufficient, since unexpected patterns can hide useful information about the data's behavior. This information can then be used to optimize the current state. This need has led to outlier detection, the data mining task that aims to find abnormal points or sequences of data hidden in a dataset. With the emergence of new technologies, applications often generate and consume data in the form of streams. Such data differs from static data, so traditional techniques cannot be applied directly; techniques suited to the nature of data streams must be used instead. In this paper, we review different outlier detection techniques for data streams. We also describe different approaches based on these techniques in order to establish a comparative study based on different criteria. This study aims to help users choose the appropriate algorithm for a given context.

Keywords

Data stream mining, Outlier detection, Data stream


Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Higher Institute of Computer Science and Management, Kairouan, Tunisia
  2. Higher Institute of Computer Science and Communication Techniques, Hammam Sousse, Tunisia
  3. RIADI-GDL Lab, Manouba University, Manouba, Tunisia
