A Relevance Weighted Ensemble Model for Anomaly Detection in Switching Data Streams

  • Mahsa Salehi
  • Christopher A. Leckie
  • Masud Moshtaghi
  • Tharshan Vaithianathan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8444)

Abstract

Anomaly detection in data streams plays a vital role in on-line data mining applications. A major challenge for anomaly detection is the dynamically changing nature of many monitoring environments. This causes a problem for traditional anomaly detection techniques in data streams, which assume a relatively static monitoring environment. In an environment that is intermittently changing (known as switching data streams), static approaches can yield a high error rate in terms of false positives. To cope with dynamic environments, we require an approach that can learn from the history of normal behaviour in data streams, while accounting for the fact that not all time periods in the past are equally relevant. Consequently, we have proposed a relevance-weighted ensemble model for learning normal behaviour, which forms the basis of our anomaly detection scheme. The advantage of this approach is that it can improve the accuracy of detection by using relevant history, while remaining computationally efficient. Our solution provides a novel contribution through the use of ensemble techniques for anomaly detection in switching data streams. Our empirical results on real and synthetic data streams show that we can achieve substantial improvements compared to a recent anomaly detection algorithm for data streams.
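The idea of weighting past periods by relevance can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not specify the model or the weighting scheme. The sketch assumes a one-dimensional stream, summarises each past window by its mean and standard deviation, and uses a hypothetical relevance measure (exponential decay in the distance between window means). All function names are illustrative.

```python
import math
from statistics import mean, pstdev


def fit_window(window):
    """Summarise one past time window by its mean and standard deviation."""
    # A small floor on the deviation avoids division by zero for constant windows.
    return mean(window), max(pstdev(window), 1e-9)


def relevance_weights(models, current_window):
    """Weight each past model by how close its mean is to the current window's mean.

    Assumption: relevance = exp(-|difference of means| / model std).
    This is an illustrative choice, not the weighting defined in the paper.
    """
    cur = mean(current_window)
    raw = [math.exp(-abs(cur - m) / s) for m, s in models]
    total = sum(raw)
    return [r / total for r in raw]


def anomaly_score(x, models, weights):
    """Weighted average of per-model z-scores for a new observation x."""
    return sum(w * abs(x - m) / s for (m, s), w in zip(models, weights))


# Two past windows from different regimes of a switching stream.
history = [
    [0.0, 0.2, -0.1, 0.1, -0.2],    # regime near 0
    [10.0, 10.3, 9.8, 10.1, 9.8],   # regime near 10
]
models = [fit_window(w) for w in history]

# The stream is currently in the regime near 10.
current = [9.9, 10.1, 10.0]
weights = relevance_weights(models, current)

# Equal weights stand in for a static model that treats all history alike.
uniform = [1.0 / len(models)] * len(models)

weighted_normal = anomaly_score(10.2, models, weights)
uniform_normal = anomaly_score(10.2, models, uniform)
weighted_outlier = anomaly_score(0.1, models, weights)
```

In this toy setting the observation 10.2 is normal for the current regime. With equal weights, the irrelevant window near 0 inflates its score, which is the kind of false positive the abstract attributes to static approaches; with relevance weights, the score is driven almost entirely by the window that matches the current regime.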

Keywords

Anomaly detection, Ensemble models, Data streams

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Mahsa Salehi (1)
  • Christopher A. Leckie (1)
  • Masud Moshtaghi (2)
  • Tharshan Vaithianathan (3)
  1. National ICT Australia, Department of Computing and Information Systems, The University of Melbourne, Australia
  2. Faculty of Information Technology, Monash University, Australia
  3. National ICT Australia, Department of Electrical and Electronic Engineering, The University of Melbourne, Australia