Network traffic analysis using machine learning: an unsupervised approach to understand and slice your network

Annals of Telecommunications

Abstract

Recent developments in smart devices have led to an explosion in data generation and heterogeneity, which calls for new network solutions to better analyze and understand traffic. These solutions should be intelligent and scalable in order to handle the huge amount of data automatically. With the progress of high-performance computing (HPC), it has become feasible to deploy machine learning (ML) to solve complex problems, and its efficiency has been validated in several domains (e.g., healthcare or computer vision). At the same time, network slicing (NS) has drawn significant attention from both industry and academia, as it is essential for addressing the diversity of service requirements. The adoption of ML within NS management is therefore an interesting issue. In this paper, we focus on analyzing network data with the objective of defining network slices according to traffic flow behaviors. For dimensionality reduction, feature selection is applied to select the most relevant features (15 out of 87) from a real dataset of more than 3 million instances. Then, K-means clustering is applied to better understand and distinguish traffic behaviors. The results demonstrate a good correlation among instances in the same cluster generated by the unsupervised learning. This solution can be further integrated into a real environment using network function virtualization.
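The two-stage pipeline described in the abstract (reduce the feature set, then cluster the flows with K-means) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not say which feature selection method was used, so a simple variance ranking stands in for it; the data are synthetic flows with 5 features rather than the real 87-feature dataset; and the names `select_top_variance`, `farthest_first_centroids` and `kmeans` are hypothetical helpers, not the authors' code. Centroids are seeded with a deterministic farthest-first rule instead of random initialisation so that the run is reproducible.

```python
import random
from statistics import pvariance


def select_top_variance(rows, k):
    """Return the sorted indices of the k columns with the largest variance.

    Assumption: variance ranking is only a stand-in for the paper's
    (unspecified here) feature selection step.
    """
    columns = list(zip(*rows))
    ranked = sorted(range(len(columns)),
                    key=lambda j: pvariance(columns[j]), reverse=True)
    return sorted(ranked[:k])


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_centroids(points, k):
    """Deterministic seeding: start at points[0], then repeatedly add the
    point that is farthest from the centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids


def kmeans(points, k, max_iterations=100):
    """Lloyd's algorithm: alternate assignment and centroid update
    until the assignment stops changing."""
    centroids = farthest_first_centroids(points, k)
    labels = None
    for _ in range(max_iterations):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous centroid if a cluster empties
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids


# Synthetic "flows": columns 0 and 1 separate two behaviours,
# columns 2 to 4 are low-variance noise.
rng = random.Random(42)
flows = []
for i in range(200):
    group = i % 2
    informative = [rng.gauss(10.0 * group, 1.0),
                   rng.gauss(-10.0 * group, 1.0)]
    noise = [rng.gauss(0.0, 0.1) for _ in range(3)]
    flows.append(informative + noise)

selected = select_top_variance(flows, k=2)
reduced = [[row[j] for j in selected] for row in flows]
labels, centroids = kmeans(reduced, k=2)

print("selected feature indices:", selected)
print("cluster sizes:", [labels.count(c) for c in range(2)])
```

On real traffic data the features would normally be scaled before either step, since both variance ranking and Euclidean K-means are sensitive to units, and the number of clusters would have to be chosen with a validity measure rather than fixed in advance.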

Notes

  1. https://www.kaggle.com/jsrojas/ip-network-traffic-flows-labeled-with-87-apps

Author information

Corresponding author

Correspondence to Kandaraj Piamrat.

About this article

Cite this article

Aouedi, O., Piamrat, K., Hamma, S. et al. Network traffic analysis using machine learning: an unsupervised approach to understand and slice your network. Ann. Telecommun. 77, 297–309 (2022). https://doi.org/10.1007/s12243-021-00889-1

  • DOI: https://doi.org/10.1007/s12243-021-00889-1
