Soft Computing, Volume 21, Issue 8, pp 2035–2046

Flexible neural trees based early stage identification for IP traffic

  • Zhenxiang Chen
  • Lizhi Peng
  • Chongzhi Gao
  • Bo Yang
  • Yuehui Chen
  • Jin Li
Methodologies and Application

Abstract

Accurately identifying network traffic at an early stage is very important for network management and security. In recent years, a growing number of studies have been devoted to finding effective machine learning models that identify traffic from the first few packets of a flow. In this paper, we aim to build an effective early stage traffic identification model by applying flexible neural trees (FNT). Three network traffic data sets, two of which are open data sets, are used in the study. We first extract both packet-level features and statistical features from the first six continuous packets and six noncontinuous packets of each flow. Packet sizes serve as the packet-level features, and the average, standard deviation, maximum, and minimum are selected as the statistical features. Eight classical classifiers are employed as comparison methods in the identification experiments. Accuracy, true positive rate (TPR), and false positive rate (FPR) are used to evaluate the performance of the compared methods. FNT outperforms the other methods in most cases in the identification experiments, and it performs very well on both TPR and FPR. Furthermore, the optimal tree it produces shows which features were selected. The experimental results show that FNT is effective for early stage traffic identification.
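The feature construction and the TPR/FPR metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the use of the population standard deviation, and the rejection of flows shorter than six packets are all assumptions, since the abstract does not specify them. The noncontinuous-packet variant is not covered, because the abstract does not say how those packets are chosen.

```python
from statistics import mean, pstdev


def early_stage_features(packet_sizes, n=6):
    """Build a feature vector from the first n packet sizes of one flow.

    Hypothetical helper: returns the n packet sizes (packet-level features)
    followed by their average, standard deviation, maximum, and minimum
    (statistical features). Population standard deviation is an assumption.
    """
    if len(packet_sizes) < n:
        raise ValueError(f"flow has fewer than {n} packets")
    first = list(packet_sizes[:n])
    stats = [mean(first), pstdev(first), max(first), min(first)]
    return first + stats


def tpr_fpr(y_true, y_pred, positive):
    """Compute the true and false positive rates for one class (one-vs-rest)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


if __name__ == "__main__":
    # Made-up packet sizes in bytes; only the first six are used.
    print(early_stage_features([60, 1500, 52, 52, 1500, 400, 999]))
    # Made-up labels for a two-class toy example.
    print(tpr_fpr(["web", "web", "p2p", "p2p"],
                  ["web", "p2p", "p2p", "web"], "web"))
```

Under these assumptions, each flow yields a ten-element vector (six packet sizes plus four statistics), which could be passed to any classifier. The metrics are computed per class by treating that class as positive and all other classes as negative.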

Keywords

Early stage traffic identification; Flexible neural trees; Machine learning


Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Zhenxiang Chen (1, 2)
  • Lizhi Peng (1, 2)
  • Chongzhi Gao (3)
  • Bo Yang (1, 2)
  • Yuehui Chen (1, 2)
  • Jin Li (3)

  1. School of Information Science Engineering, University of Jinan, Jinan, China
  2. Shandong Provincial Key Lab of Network based Intelligent Computing, Jinan, China
  3. Department of Computer Science, Guangzhou University, Guangzhou, China
