SmoteAdaNL: a learning method for network traffic classification

  • Original Research
Journal of Ambient Intelligence and Humanized Computing

Abstract

Machine-learning-based network traffic classification is a critical technique for network management and has attracted much attention. Most recent research focuses on achieving high flow classification accuracy (FCA). However, because “mice” flows outnumber “elephant” flows on the Internet, the resulting classifiers are better suited to “mice” flows and have low byte classification accuracy (BCA). To address this issue, the notion of byte misclassification is first explored. Based on the finding that most misclassified bytes belong to the minority class, a novel network traffic classification method is proposed that combines data re-sampling with ensemble learning. To improve classification accuracy on the minority class, a data re-sampling algorithm is employed to increase the number of minority-class flows. Re-sampling, however, changes the data distribution and degrades the generalization of a classifier. A boosting-style ensemble learning algorithm that takes ensemble diversity into account is therefore employed to improve generalization. Experiments conducted on real-world traffic datasets show that the proposed method achieves over 90% BCA and 96% FCA on average, and improves BCA by about 7.15% compared with existing methods.
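
The gap between FCA and BCA can be illustrated with a small sketch. It assumes the usual definitions: FCA is the fraction of flows classified correctly, and BCA is the fraction of bytes carried by correctly classified flows. The function names, class labels, and toy flow sizes below are hypothetical and chosen only for illustration; they are not taken from the article's datasets.

```python
from typing import Sequence


def flow_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of flows whose predicted class matches the true class (FCA)."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def byte_accuracy(
    y_true: Sequence[str], y_pred: Sequence[str], flow_bytes: Sequence[int]
) -> float:
    """Fraction of bytes that belong to correctly classified flows (BCA)."""
    correct_bytes = sum(
        b for t, p, b in zip(y_true, y_pred, flow_bytes) if t == p
    )
    return correct_bytes / sum(flow_bytes)


# Hypothetical toy trace: nine small "mice" flows and one large "elephant" flow.
y_true = ["web"] * 9 + ["p2p"]
flow_bytes = [1_000] * 9 + [1_000_000]

# A classifier biased toward the majority class labels every flow "web".
y_pred = ["web"] * 10

fca = flow_accuracy(y_true, y_pred)              # 9 of 10 flows correct
bca = byte_accuracy(y_true, y_pred, flow_bytes)  # 9,000 of 1,009,000 bytes correct

print(f"FCA = {fca:.3f}, BCA = {bca:.4f}")
```

In this toy case the single misclassified flow belongs to the minority class yet carries almost all of the bytes, so FCA stays at 0.9 while BCA falls below 0.01. This is the effect that motivates re-sampling the minority class.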

Acknowledgments

This work is supported by the National Natural Science Fund, China (Grant No. 61300198), the Guangdong Province Natural Science Foundation (No. S2013040016582), the Guangdong Higher School Scientific Innovation Project (Nos. 2013KJCX0177 and 2014KTSCX188), the Fundamental Research Funds for the Central Universities (SCUT 2014ZB0029), and the China Postdoctoral Science Foundation (No. 2014M552199).

Author information

Corresponding author

Correspondence to Ruoyu Wang.

About this article

Cite this article

Liu, Z., Wang, R. & Tao, M. SmoteAdaNL: a learning method for network traffic classification. J Ambient Intell Human Comput 7, 121–130 (2016). https://doi.org/10.1007/s12652-015-0310-y

  • DOI: https://doi.org/10.1007/s12652-015-0310-y
