Abstract
Accurate classification of network traffic can offer network operators and service providers substantial benefits in service differentiation, enforcement of security policies and traffic engineering. Machine learning algorithms have been used to classify network traffic with good results. However, when the confidence of these classifications is unknown, it is difficult to measure and control the risk of error using a decision rule. Modern network resource management systems are becoming increasingly complex and therefore require high-quality, reliable predictions with confidence measures. These reliability measures allow service providers and network carriers to perform a cost-benefit evaluation of alternative actions and to optimise network performance measures such as delay and information loss. In this paper, we consider the problem of reliable network traffic classification. Two recently developed machine learning methods, namely the Conformal Predictor and the Venn Probability Machine, are presented for application in network traffic classification. Both methods rely on the assumption that the data instances are independent and identically distributed. Experiments on publicly available real network traffic datasets in the on-line setting show that these two methods can perform well and produce reliable classifications. A comparison between the two methods is also made.
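As a rough illustration of what a classification with a confidence measure can look like, the following Python sketch computes conformal p-values for each candidate label and returns the set of labels that cannot be rejected at a chosen significance level. The nearest-neighbour nonconformity score is an assumption made for brevity and is not necessarily the measure used in the paper; the function names, the two-dimensional flow features and the labels "web" and "p2p" are hypothetical.

```python
import math

def nonconformity(i, xs, ys):
    """1-NN score (assumed here): distance to the nearest example with the
    same label divided by distance to the nearest example with another label.
    Larger values mean example i looks stranger given its label."""
    same = [math.dist(xs[i], xs[j]) for j in range(len(xs))
            if j != i and ys[j] == ys[i]]
    other = [math.dist(xs[i], xs[j]) for j in range(len(xs))
             if ys[j] != ys[i]]
    if not same:
        return math.inf   # no same-label neighbour: maximally strange
    if not other:
        return 0.0        # only one label present: nothing to contrast with
    d_other = min(other)
    return min(same) / d_other if d_other > 0 else math.inf

def p_values(train_x, train_y, x_new, labels):
    """For each candidate label, tentatively assign it to x_new and compute
    the fraction of examples at least as nonconforming as x_new."""
    result = {}
    for label in labels:
        xs = list(train_x) + [x_new]
        ys = list(train_y) + [label]
        scores = [nonconformity(i, xs, ys) for i in range(len(xs))]
        result[label] = sum(s >= scores[-1] for s in scores) / len(scores)
    return result

def prediction_set(pvals, epsilon):
    """Keep every label whose p-value exceeds the significance level."""
    return {label for label, p in pvals.items() if p > epsilon}

# Hypothetical flows described by two made-up numeric features.
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["web", "web", "web", "p2p", "p2p", "p2p"]

pvals = p_values(train_x, train_y, (0.5, 0.5), ["web", "p2p"])
print(pvals)
print(prediction_set(pvals, epsilon=0.2))
```

Under the independence and identical distribution assumption stated in the abstract, a prediction set built this way excludes the true label with probability at most epsilon, which is what allows the risk of error to be controlled by a decision rule. In the on-line setting, each new flow would be added to the training data once its true label is revealed.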
Cite this article
Dashevskiy, M., Luo, Z. Two methods for reliable classification of network traffic. Prog Artif Intell 1, 223–234 (2012). https://doi.org/10.1007/s13748-012-0019-5