On-the-fly Statistical Classification of Internet Traffic at Application Layer Based on Cluster Analysis

  • Andrea Baiocchi
  • Gianluca Maiolini
  • Giacomo Molina
  • Antonello Rizzi
Part of the Advances in Soft Computing book series (AINSC, volume 53)


We address the problem of classifying Internet packet flows according to the application-level protocol that generated them. Unlike deep packet inspection, which reads up to application-layer payloads and keeps track of packet sequences, we consider classification based on statistical features extracted in real time from the packet flow, namely IP packet lengths and inter-arrival times. A statistical classification algorithm is proposed, built upon the tools of cluster analysis. Using traffic traces taken at the Networking Lab of our Department and traces from CAIDA, we defined data sets made up of thousands of flows for up to five different application protocols. With the classic approach of training and test data sets, we show that cluster analysis yields very good results despite the little information it relies on, a restriction imposed by the real-time decision requirement. We aim to show that the investigated applications are characterized by a “signature” at the network layer that can be used to recognize them even when the port number is not significant. Numerical results are presented to highlight the effect of the major algorithm parameters. We discuss the complexity and possible uses of the statistical classifier.
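The abstract does not name the clustering algorithm or the exact feature vector, so the following is only a minimal sketch under stated assumptions: plain k-means stands in for the cluster analysis step, the features are the IP lengths and inter-arrival times of the first few packets of a flow, and each cluster takes the majority label of the training flows it contains. The function names, the `n_first` parameter, and the absence of feature scaling are all illustrative choices, not the authors' method.

```python
import math
import random
from collections import Counter


def flow_features(packets, n_first=3):
    """Feature vector from the first n_first packets of a flow.

    packets: list of (timestamp_seconds, ip_length_bytes) tuples.
    Assumption: every flow has at least n_first packets.
    Returns n_first lengths followed by n_first - 1 inter-arrival times.
    """
    head = packets[:n_first]
    lengths = [float(length) for _, length in head]
    gaps = [t2 - t1 for (t1, _), (t2, _) in zip(head, head[1:])]
    return lengths + gaps


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means (assumed stand-in for the paper's clustering step)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its group;
        # an empty group keeps its previous centroid.
        centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def train(flows, labels, k, n_first=3):
    """Cluster training flows, then label each cluster by majority vote."""
    points = [flow_features(f, n_first) for f in flows]
    centroids = kmeans(points, k)
    votes = [Counter() for _ in range(k)]
    for p, label in zip(points, labels):
        votes[nearest(p, centroids)][label] += 1
    cluster_labels = [v.most_common(1)[0][0] if v else None for v in votes]
    return centroids, cluster_labels


def classify(flow, centroids, cluster_labels, n_first=3):
    """Assign a new flow the label of its nearest cluster."""
    point = flow_features(flow, n_first)
    return cluster_labels[nearest(point, centroids)]
```

In this sketch a decision needs only the first `n_first` packets of a flow, which is one way to respect a real-time requirement. Lengths (bytes) and inter-arrival times (seconds) are left unscaled here, so lengths dominate the distance; a practical classifier would likely normalize the features first.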


Interarrival Time, Internet Traffic, Average Classification Accuracy, Traffic Classification, Traffic Trace
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Andrea Baiocchi (1)
  • Gianluca Maiolini (2)
  • Giacomo Molina (1)
  • Antonello Rizzi (1)

  1. INFOCOM Dept., University of Roma “Sapienza”, Rome, Italy
  2. ELSAG Datamat – Divisione automazione sicurezza e trasporti, Rome, Italy
