Retraining Mechanism for On-Line Peer-to-Peer Traffic Classification
The quality of Peer-to-Peer (P2P) detection using machine learning (ML) classification depends on the quality and recency of the classifier's training. In this paper, a practical retraining mechanism is proposed that retrains an on-line P2P ML classifier as network traffic behavior changes. The mechanism evaluates the accuracy of the on-line P2P ML classifier against training datasets whose flows are labeled by a heuristic-based training dataset generator. The on-line P2P ML classifier is retrained if its accuracy falls below a predefined threshold. The proposed system has been evaluated on traces captured from the Universiti Teknologi Malaysia (UTM) campus network between October and November 2011. The overall results show that the training dataset generator produces accurate training datasets, classifying P2P flows with high accuracy (98.47%) and a low false positive rate (1.37%). The on-line P2P ML classifier, which is built on the J48 algorithm, is shown to be capable of self-retraining over time.
Keywords: Peer-to-peer, machine learning, traffic classification, self-retraining
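The retraining rule described in the abstract (evaluate the on-line classifier against heuristically labeled flows, retrain when accuracy drops below a threshold) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: `StumpClassifier` is a one-feature threshold learner standing in for J48, the labeled batch is assumed to come from the heuristic-based generator, and the 0.9 threshold is a placeholder, not the value used in the paper.

```python
from typing import Sequence, Tuple

Flow = Tuple[float, ...]            # feature vector of one flow (hypothetical features)
LabeledFlow = Tuple[Flow, bool]     # (features, is_p2p) as labeled by the heuristic generator


class StumpClassifier:
    """One-feature threshold learner; a simple stand-in for J48, not J48 itself."""

    def __init__(self) -> None:
        self.feature = 0
        self.cutoff = 0.0
        self.above_is_p2p = True

    def fit(self, data: Sequence[LabeledFlow]) -> None:
        # Exhaustively try every feature, observed cutoff, and polarity;
        # keep the combination that classifies the most flows correctly.
        best = -1
        for f in range(len(data[0][0])):
            for flow, _ in data:
                for above in (True, False):
                    correct = sum(((x[f] > flow[f]) == above) == y for x, y in data)
                    if correct > best:
                        best = correct
                        self.feature, self.cutoff, self.above_is_p2p = f, flow[f], above

    def predict(self, flow: Flow) -> bool:
        return (flow[self.feature] > self.cutoff) == self.above_is_p2p


def accuracy(clf: StumpClassifier, data: Sequence[LabeledFlow]) -> float:
    """Fraction of flows whose predicted label matches the heuristic label."""
    return sum(clf.predict(x) == y for x, y in data) / len(data)


def check_and_retrain(
    clf: StumpClassifier,
    labeled_batch: Sequence[LabeledFlow],
    threshold: float = 0.9,  # assumed placeholder threshold
) -> Tuple[float, bool]:
    """Return (accuracy before retraining, whether retraining happened)."""
    acc = accuracy(clf, labeled_batch)
    if acc < threshold:
        clf.fit(labeled_batch)
        return acc, True
    return acc, False


# Initial training: P2P flows have large values of the single toy feature.
old = [((10.0,), True), ((12.0,), True), ((1.0,), False), ((2.0,), False)]
clf = StumpClassifier()
clf.fit(old)

# Traffic behavior shifts: P2P flows now have small values.
new = [((1.0,), True), ((2.0,), True), ((10.0,), False), ((12.0,), False)]
acc_before, retrained = check_and_retrain(clf, new)
acc_after = accuracy(clf, new)
```

In this toy run the classifier fits the old traffic, fails on the shifted batch, and is refit on it. A real deployment would evaluate on flows held out from the retraining set, which this sketch omits for brevity.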