Relational Frequent Patterns Mining for Novelty Detection from Data Streams

  • Michelangelo Ceci
  • Annalisa Appice
  • Corrado Loglisci
  • Costantina Caruso
  • Fabio Fumarola
  • Carmine Valente
  • Donato Malerba
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5632)

Abstract

We face the problem of novelty detection from stream data, that is, the identification of new or unknown situations in an ordered sequence of objects which arrive on-line, at consecutive time points. We extend previous solutions by considering the case of objects modeled by multiple database relations. Frequent relational patterns are efficiently extracted at each time point, and a time window is used to filter out novelty patterns. An application of the proposed algorithm to the problem of detecting anomalies in network traffic is described, and quantitative and qualitative results obtained by analyzing a real stream of data collected from firewall logs are reported.
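
The idea of extracting frequent patterns at each time point and using a time window to single out novel ones can be illustrated with a minimal sketch. It is not the algorithm of the paper: it assumes flat itemsets instead of relational patterns, brute-force enumeration instead of an efficient search, and a hypothetical novelty criterion (a pattern is novel if it is frequent now but was not frequent at any of the previous `window` time points). The function names, parameters, and sample data are all illustrative assumptions.

```python
from collections import Counter, deque
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return the itemsets (as frozensets) whose relative support in
    `transactions` is at least `min_support`.
    Brute-force enumeration, suitable for tiny examples only."""
    counts = Counter()
    for items in transactions:
        unique = sorted(set(items))
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                counts[frozenset(combo)] += 1
    n = len(transactions)
    return {p for p, c in counts.items() if n and c / n >= min_support}


def novelty_patterns(stream, min_support=0.5, window=2, max_size=2):
    """Yield (time_index, novel_patterns) for each batch in `stream`.

    Assumption (not taken from the paper): a pattern is novel at time t
    if it is frequent at t but was not frequent at any of the previous
    `window` time points.
    """
    history = deque(maxlen=window)  # frequent patterns of recent time points
    for t, batch in enumerate(stream):
        current = frequent_itemsets(batch, min_support, max_size)
        seen = set().union(*history)
        yield t, current - seen
        history.append(current)


# Made-up connection descriptions, one batch per time point.
stream = [
    [{"tcp", "port80"}, {"tcp", "port80"}, {"udp", "port53"}],
    [{"tcp", "port80"}, {"tcp", "port80"}, {"udp", "port53"}],
    [{"tcp", "port22"}, {"tcp", "port22"}, {"tcp", "port80"}],
]
results = list(novelty_patterns(stream))
```

With this toy stream, the second time point yields no novel patterns, while the third flags the patterns involving `port22`. Every pattern at the first time point counts as novel, since the window is still empty.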

Keywords

Data Stream, Anomaly Detection, Reference Object, Relational Pattern, Novelty Detection
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Michelangelo Ceci (1)
  • Annalisa Appice (1)
  • Corrado Loglisci (1)
  • Costantina Caruso (1)
  • Fabio Fumarola (1)
  • Carmine Valente (1)
  • Donato Malerba (1)

  1. Dipartimento di Informatica, Università degli Studi di Bari, Bari, Italy
