K-Dimensional Trees for Continuous Traffic Classification

  • Valentín Carela-Español
  • Pere Barlet-Ros
  • Marc Solé-Simó
  • Alberto Dainotti
  • Walter de Donato
  • Antonio Pescapé
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6003)


The network measurement community has proposed multiple machine learning (ML) methods for traffic classification in recent years. Although several research works have reported accuracies over 90%, most network operators still use either obsolete (e.g., port-based) or extremely expensive (e.g., pattern matching) methods for traffic classification. We argue that one of the barriers to the real deployment of ML-based methods is their time-consuming training phase. In this paper, we revisit the viability of using the Nearest Neighbor technique for traffic classification. We present an efficient implementation of this well-known technique based on multiple K-dimensional trees, which is characterized by short training times and high classification speed. This allows us not only to run the classifier online but also to retrain it continuously, without human intervention, as the training data become obsolete. The proposed solution achieves very promising accuracy (>95%) while looking only at the sizes of the first few packets of a flow. We present an implementation of this method based on the TIE classification engine as a feasible and simple solution for network operators.
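To make the idea concrete, the following is a minimal sketch of nearest-neighbor classification over a single K-dimensional tree, where each flow is described by the sizes of its first three packets. It is not the authors' implementation: the paper uses multiple trees and the TIE engine, whereas this sketch uses one tree, and all names (`build_kdtree`, `nearest`, `classify`), the feature layout, and the training samples are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, ...]
Sample = Tuple[Point, str]  # (sizes of the first packets, application label)


class Node:
    """One node of the K-d tree: a training sample plus its split axis."""

    def __init__(self, point: Point, label: str, axis: int,
                 left: Optional["Node"], right: Optional["Node"]) -> None:
        self.point, self.label, self.axis = point, label, axis
        self.left, self.right = left, right


def build_kdtree(samples: List[Sample], depth: int = 0) -> Optional[Node]:
    """'Training': split on the median, cycling through the dimensions."""
    if not samples:
        return None
    axis = depth % len(samples[0][0])
    ordered = sorted(samples, key=lambda s: s[0][axis])
    mid = len(ordered) // 2
    point, label = ordered[mid]
    return Node(point, label, axis,
                build_kdtree(ordered[:mid], depth + 1),
                build_kdtree(ordered[mid + 1:], depth + 1))


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node: Optional[Node], query: Point,
            best: Optional[Tuple[float, str]] = None) -> Optional[Tuple[float, str]]:
    """Return (squared distance, label) of the closest training sample."""
    if node is None:
        return best
    dist = sq_dist(node.point, query)
    if best is None or dist < best[0]:
        best = (dist, node.label)
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if diff * diff < best[0]:
        best = nearest(far, query, best)
    return best


def classify(tree: Optional[Node], first_packet_sizes: Point) -> str:
    """Label a flow with the class of its nearest training sample."""
    if tree is None:
        raise ValueError("the tree has no training samples")
    return nearest(tree, tuple(first_packet_sizes))[1]


# Invented toy data: sizes in bytes of the first three packets of each flow.
TRAINING: List[Sample] = [
    ((74, 66, 583), "web"),
    ((74, 66, 1500), "web"),
    ((62, 62, 130), "p2p"),
    ((62, 70, 95), "p2p"),
    ((90, 250, 40), "mail"),
]

tree = build_kdtree(TRAINING)
print(classify(tree, (70, 66, 600)))  # web
print(classify(tree, (60, 64, 120)))  # p2p
```

In this sketch, retraining amounts to calling `build_kdtree` again on a fresh set of labeled flows. Because building the tree is cheap compared with training many other ML models, rebuilding it periodically is one plausible way to picture the continuous retraining described above.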


Keywords: Network Operator; Packet Size; Port Number; Continuous Training; Application Group
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Valentín Carela-Español (1)
  • Pere Barlet-Ros (1)
  • Marc Solé-Simó (1)
  • Alberto Dainotti (2)
  • Walter de Donato (2)
  • Antonio Pescapé (2)

  1. Department of Computer Architecture, Universitat Politècnica de Catalunya (UPC)
  2. Department of Computer Engineering and Systems, Università di Napoli Federico II
