A Data Mining Methodology for Anomaly Detection in Network Data

  • Costantina Caruso
  • Donato Malerba
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4693)


Anomaly detection builds profiles that represent the normal behavior of users, hosts or networks and detects attacks as significant deviations from these profiles. Our methodology applies several data mining methods and returns an adaptive normal daily model of the network traffic as the result of four main steps, which are illustrated in the paper. The original observation units (the network connections) are transformed into symbolic objects, and the normal model itself is given by a particular set of symbolic objects. A new symbolic object is considered an anomaly if it is dissimilar from those belonging to the model. An anomaly is added to the model if it is ranked as a changing point, i.e. a new but legal behavior of the network traffic; otherwise it is an outlier, i.e. a new but illegal aspect of the network traffic. The resulting model of network connections can be used by a network administrator to identify deviations in network traffic patterns that may demand her attention. The methodology is applied to the firewall logs of our Department network.
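The decision logic described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: symbolic objects are modeled as mappings from a feature name to a set of observed values, the dissimilarity is a plain mean Jaccard distance, the threshold is arbitrary, and the changing-point ranking is left as a caller-supplied predicate (`is_changing_point`), since the abstract does not specify how that ranking is computed.

```python
from typing import Callable, Dict, FrozenSet, List

# Hypothetical representation: feature name -> set of observed values.
SymbolicObject = Dict[str, FrozenSet[str]]


def jaccard_distance(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Return 1 - |a & b| / |a | b|, or 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def dissimilarity(x: SymbolicObject, y: SymbolicObject) -> float:
    """Mean Jaccard distance over all features (an illustrative stand-in,
    not the measure used by the authors)."""
    features = set(x) | set(y)
    if not features:
        return 0.0
    empty: FrozenSet[str] = frozenset()
    total = sum(jaccard_distance(x.get(f, empty), y.get(f, empty)) for f in features)
    return total / len(features)


def is_anomaly(obj: SymbolicObject, model: List[SymbolicObject], threshold: float) -> bool:
    """An object is anomalous if even its closest model member is too dissimilar."""
    if not model:
        return True
    return min(dissimilarity(obj, m) for m in model) > threshold


def classify_and_update(
    obj: SymbolicObject,
    model: List[SymbolicObject],
    threshold: float,
    is_changing_point: Callable[[SymbolicObject], bool],
) -> str:
    """Label obj; anomalies ranked as changing points are added to the model."""
    if not is_anomaly(obj, model, threshold):
        return "normal"
    if is_changing_point(obj):
        model.append(obj)  # new but legal behavior: the model adapts
        return "changing point"
    return "outlier"  # new but illegal behavior: the model is left unchanged
```

In this sketch the model adapts only through changing points, so an object that was once flagged and accepted is treated as normal the next time it appears, while outliers never alter the model.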


Anomaly detection, ranking anomalies, decision boundary, adaptive model, data mining, machine learning, event monitoring, network traffic analysis





Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Costantina Caruso
    • 1
  • Donato Malerba
    • 1
  1. Dipartimento di Informatica, Università degli Studi di Bari, Via E. Orabona 4, 70126 Bari, Italy
