Abstract
Recent developments in smart devices have led to an explosion in data generation and heterogeneity, which calls for new network solutions to better analyze and understand traffic. These solutions should be intelligent and scalable in order to handle the huge amount of data automatically. With the progress of high-performance computing (HPC), it has become feasible to deploy machine learning (ML) to solve complex problems, and its efficiency has been validated in several domains (e.g., healthcare and computer vision). At the same time, network slicing (NS) has drawn significant attention from both industry and academia, as it is essential for addressing the diversity of service requirements. The adoption of ML within NS management is therefore an interesting research issue. In this paper, we focus on analyzing network data with the objective of defining network slices according to traffic flow behaviors. For dimensionality reduction, feature selection is applied to keep the most relevant features (15 out of 87) from a real dataset of more than 3 million instances. K-means clustering is then applied to better understand and distinguish traffic behaviors. The results demonstrate a good correlation among instances in the same cluster generated by the unsupervised learning. This solution can be further integrated into a real environment using network function virtualization.
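The two-step pipeline summarized above (feature selection followed by K-means clustering) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the feature selection method, so variance ranking is assumed here as a stand-in, and the data are synthetic (6 features, 200 flows) rather than the 87-feature real dataset. The function names, the farthest-point initialization, and the synthetic generator are all hypothetical choices made for illustration. Only the Python standard library is used so that the sketch runs as is.

```python
import random
import statistics


def make_synthetic_flows(n_per_group=100, seed=1):
    """Hypothetical data: two flow groups, 3 informative + 3 near-constant features."""
    rng = random.Random(seed)
    rows = []
    for center in (0.0, 10.0):
        for _ in range(n_per_group):
            informative = [rng.gauss(center, 1.0) for _ in range(3)]
            noise = [rng.gauss(0.0, 0.1) for _ in range(3)]
            rows.append(informative + noise)
    return rows


def select_top_variance_features(rows, k):
    """Return the sorted indices of the k highest-variance columns.

    Assumption: variance ranking stands in for the (unspecified) selection method.
    """
    n_cols = len(rows[0])
    variances = [statistics.pvariance([r[j] for r in rows]) for j in range(n_cols)]
    ranked = sorted(range(n_cols), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:k])


def standardize(rows):
    """Scale every column to zero mean and unit standard deviation."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard constant columns
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(rows, n_clusters):
    """Deterministic start: first row, then repeatedly the row farthest from all centroids."""
    centroids = [list(rows[0])]
    while len(centroids) < n_clusters:
        farthest = max(rows, key=lambda r: min(squared_distance(r, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids


def kmeans(rows, n_clusters, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until labels stop changing."""
    centroids = farthest_point_init(rows, n_clusters)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(n_clusters), key=lambda c: squared_distance(r, centroids[c]))
            for r in rows
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(n_clusters):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # keep the previous centroid if a cluster becomes empty
                centroids[c] = [statistics.fmean(col) for col in zip(*members)]
    return labels, centroids


flows = make_synthetic_flows()
kept = select_top_variance_features(flows, k=3)
reduced = standardize([[row[j] for j in kept] for row in flows])
flow_labels, flow_centroids = kmeans(reduced, n_clusters=2)
print("kept feature indices:", kept)
print("cluster sizes:", [flow_labels.count(c) for c in range(2)])
```

Selection is done on the raw values and standardization afterwards, because standardizing first would give every column unit variance and make a variance ranking meaningless. On a dataset of several million instances, a vectorized library implementation (for example scikit-learn's `KMeans`) would be the practical choice; the pure-Python loops above are only meant to make the two steps explicit.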
Cite this article
Aouedi, O., Piamrat, K., Hamma, S. et al. Network traffic analysis using machine learning: an unsupervised approach to understand and slice your network. Ann. Telecommun. 77, 297–309 (2022). https://doi.org/10.1007/s12243-021-00889-1
DOI: https://doi.org/10.1007/s12243-021-00889-1