Retraining Mechanism for On-Line Peer-to-Peer Traffic Classification

  • Roozbeh Zarei
  • Alireza Monemi
  • Muhammad Nadzir Marsono
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 182)


Peer-to-peer (P2P) traffic detection using machine learning (ML) classification depends on the quality and recency of its training data. This paper proposes a practical retraining mechanism that keeps an on-line P2P ML classifier up to date with changes in network traffic behavior. The mechanism evaluates the accuracy of the on-line classifier against training datasets whose flows are labeled by a heuristic-based training dataset generator, and retrains the classifier if its accuracy falls below a predefined threshold. The proposed system was evaluated on traces captured from the Universiti Teknologi Malaysia (UTM) campus network between October and November 2011. Overall, the results show that the training dataset generator produces accurate training datasets, classifying P2P flows with high accuracy (98.47%) and a low false-positive rate (1.37%). The on-line P2P ML classifier, built on the J48 algorithm, is demonstrated to be capable of self-retraining over time.
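The threshold-triggered retraining loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `SelfRetrainingClassifier`, `train_stump`, `accuracy`, and `false_positive_rate`, the single-feature flow representation, and the 0.9 threshold are all assumptions made for illustration, and a one-feature decision stump stands in for the J48 decision tree. Each batch is assumed to arrive already labeled, as it would from the heuristic-based training dataset generator.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Flow = Tuple[float, ...]              # hypothetical per-flow feature vector
Labeled = Tuple[Flow, bool]           # (features, is_p2p), labels from the heuristic generator
Predictor = Callable[[Flow], bool]


def accuracy(predict: Predictor, labeled: Sequence[Labeled]) -> float:
    """Fraction of flows whose predicted class matches the heuristic label."""
    if not labeled:
        raise ValueError("cannot score an empty batch")
    return sum(1 for x, y in labeled if predict(x) == y) / len(labeled)


def false_positive_rate(predict: Predictor, labeled: Sequence[Labeled]) -> float:
    """Fraction of non-P2P flows that are wrongly predicted as P2P."""
    negatives: List[Flow] = [x for x, y in labeled if not y]
    if not negatives:
        return 0.0
    return sum(1 for x in negatives if predict(x)) / len(negatives)


def train_stump(labeled: Sequence[Labeled]) -> Predictor:
    """Stand-in trainer (NOT J48): best single cut on feature 0."""
    best_score, best_predict = -1.0, None
    for cut in sorted({x[0] for x, _ in labeled}):
        for p2p_above in (True, False):
            # Default arguments freeze the current cut and polarity.
            predict = lambda x, c=cut, a=p2p_above: (x[0] >= c) == a
            score = accuracy(predict, labeled)
            if score > best_score:
                best_score, best_predict = score, predict
    return best_predict


class SelfRetrainingClassifier:
    """Retrains whenever accuracy on a freshly labeled batch drops below threshold."""

    def __init__(self, train: Callable[[Sequence[Labeled]], Predictor],
                 threshold: float) -> None:
        self._train = train
        self.threshold = threshold
        self.predict: Optional[Predictor] = None
        self.retrain_count = 0

    def update(self, labeled_batch: Sequence[Labeled]) -> bool:
        """Evaluate on the batch; return True if the model was (re)trained."""
        if not labeled_batch:
            return False
        stale = (self.predict is None
                 or accuracy(self.predict, labeled_batch) < self.threshold)
        if stale:
            self.predict = self._train(labeled_batch)
            self.retrain_count += 1
        return stale
```

In use, `update` would be called once per labeled batch: the first call trains the initial model, later calls leave it untouched while accuracy stays at or above the threshold, and a batch reflecting changed traffic behavior triggers retraining on that batch. Retraining on only the most recent batch is a simplification; the abstract does not say how much history the real system retains.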


Keywords: peer-to-peer, machine learning, traffic classification, self-retraining





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Roozbeh Zarei (1)
  • Alireza Monemi (1)
  • Muhammad Nadzir Marsono (1)
  1. Faculty of Electrical Engineering, Universiti Teknologi Malaysia, Johor Bahru, Malaysia
