Outlier Detection in Data Streams Using OLAP Cubes

  • Felix Heine
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 767)

Abstract

Outlier detection is an important tool in many application areas. Data often has a multidimensional structure, so it can be viewed as OLAP cubes. Exploiting this structure systematically helps to find outliers that would otherwise go undetected. In this paper, we propose an approach that treats streaming data as a series of OLAP cubes. We then use a model of the cube’s expected behavior, calculated offline, to find outliers in the data stream. Furthermore, we aggregate multiple outliers found concurrently at different cells of the cube to a user-defined level of the cube. To show the method’s usefulness, we apply it to network data to find attacks in the data stream.
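The pipeline outlined in the abstract (one cube per stream window, an offline model of expected cell values, and a roll-up of concurrent outliers) might be sketched as follows. This is only an illustration under stated assumptions, not the paper's method: the abstract does not specify the model, so a per-cell mean and standard deviation with a z-score threshold is swapped in, and the dimension names `protocol` and `service`, the function names, and the `threshold` and `min_std` parameters are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations
from statistics import mean, pstdev

DIMS = ("protocol", "service")  # hypothetical dimension names
ALL = "*"                       # marks a dimension that has been aggregated away


def cube_cells(records):
    """Count the records of one stream window in every cell of every cuboid."""
    counts = Counter()
    for rec in records:
        for k in range(len(DIMS) + 1):
            for kept in combinations(range(len(DIMS)), k):
                cell = tuple(rec[d] if i in kept else ALL
                             for i, d in enumerate(DIMS))
                counts[cell] += 1
    return counts


def fit_model(training_windows):
    """Offline step (assumed model): per-cell mean and standard deviation."""
    cubes = [cube_cells(w) for w in training_windows]
    model = {}
    for cell in set().union(*cubes):
        values = [c[cell] for c in cubes]  # a Counter yields 0 for absent cells
        model[cell] = (mean(values), pstdev(values))
    return model


def detect(window, model, threshold=3.0, min_std=1.0):
    """Online step: flag cells whose count deviates strongly from the model."""
    cube = cube_cells(window)
    outliers = {}
    for cell in set(cube) | set(model):
        mu, sigma = model.get(cell, (0.0, 0.0))
        # min_std avoids division by zero for cells that never varied
        score = abs(cube[cell] - mu) / max(sigma, min_std)
        if score > threshold:
            outliers[cell] = score
    return outliers


def roll_up(outliers, level):
    """Aggregate outlier cells to the cuboid keeping only dimensions in `level`."""
    grouped = defaultdict(list)
    for cell, score in outliers.items():
        parent = tuple(v if d in level else ALL for d, v in zip(DIMS, cell))
        grouped[parent].append(score)
    # report the strongest score seen below each parent cell
    return {parent: max(scores) for parent, scores in grouped.items()}
```

With a model fitted on windows of normal traffic, `detect(window, model)` returns the deviating cells of a new window, and `roll_up(outliers, ("protocol",))` condenses them into one entry per protocol. Taking the maximum score per parent cell is one possible aggregation choice among several.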

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Department of Computer Science, Faculty IV, Hannover University of Applied Sciences and Arts, Hannover, Germany
