Abstract
Classification models are now widely used for network traffic anomaly detection. Researchers have found that class imbalance in network traffic datasets affects anomaly detection results in convergence networks. Many solutions have been proposed for imbalanced binary (two-class) datasets, but network traffic datasets are often multi-class: they contain many different types of anomalous traffic, and the frequency of occurrence varies greatly between types. Such multi-class imbalanced traffic data often carries complex information, and the important features differ from class to class. To address these problems, this paper proposes a new oversampling technique. It first selects samples based on the information entropy of the minority class samples, then applies principal component analysis (an eigenvalue decomposition of the data) so that the features become uncorrelated with each other, and finally synthesizes new data. We validate the effectiveness of the method on multi-class network traffic datasets.
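The three stages named in the abstract (entropy-based selection, PCA decorrelation, synthesis) can be illustrated with a minimal NumPy sketch. The abstract does not specify any details, so this is not the authors' algorithm. The following are all assumptions made for illustration: entropy is computed over the class labels of each sample's k nearest neighbors, minority samples are kept as seeds when that entropy is at most a hypothetical threshold `max_entropy`, and new points are produced by interpolating each principal component independently between two random seeds. The function names and parameters are likewise hypothetical.

```python
import numpy as np


def neighborhood_entropy(X, y, k=5):
    """Shannon entropy (bits) of the class labels among each sample's k nearest neighbors.

    Assumption: this is one plausible reading of "information entropy of
    minority class samples"; the abstract does not define the measure.
    """
    # Pairwise squared Euclidean distances; adequate for small illustrative data.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbor
    nn = np.argsort(d2, axis=1)[:, :k]
    ent = np.zeros(len(X))
    for c in np.unique(y):
        p = (y[nn] == c).mean(axis=1)  # share of neighbors belonging to class c
        nz = p > 0
        ent[nz] -= p[nz] * np.log2(p[nz])
    return ent


def entropy_pca_oversample(X, y, target_class, n_new, k=5, max_entropy=0.9, seed=0):
    """Return n_new synthetic samples for target_class (illustrative sketch only)."""
    rng = np.random.default_rng(seed)

    # Step 1 (assumed rule): keep minority samples with low neighborhood entropy.
    ent = neighborhood_entropy(X, y, k)
    seeds = X[(y == target_class) & (ent <= max_entropy)]
    if len(seeds) < 2:
        raise ValueError("need at least two selected seed samples")

    # Step 2: PCA via eigendecomposition of the covariance matrix.
    mean = seeds.mean(axis=0)
    centered = seeds - mean
    cov = np.atleast_2d(np.cov(centered, rowvar=False))
    _, eigvecs = np.linalg.eigh(cov)
    Z = centered @ eigvecs  # coordinates whose components are uncorrelated

    # Step 3 (assumed rule): interpolate every component independently
    # between two randomly chosen seeds, then map back to feature space.
    i = rng.integers(0, len(Z), size=n_new)
    j = rng.integers(0, len(Z), size=n_new)
    lam = rng.random((n_new, Z.shape[1]))
    Z_new = Z[i] + lam * (Z[j] - Z[i])
    return Z_new @ eigvecs.T + mean
```

Drawing a separate interpolation weight per component is a deliberate choice in this sketch: with a single shared weight, interpolating in the rotated space would be identical to interpolating in the original feature space, and the decorrelation step would have no effect.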
Acknowledgment
This work is supported by the National Key R&D Program of China (2019YFB2103202, 2019YFB2103200).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Zhao, Q., Yang, Y., Zhao, L., Wang, Z., Cui, D., Gao, Z. (2022). Unbalanced Data Oversampling Method for Traffic Multi-classification in Convergence Network. In: Liu, Q., Liu, X., Chen, B., Zhang, Y., Peng, J. (eds) Proceedings of the 11th International Conference on Computer Engineering and Networks. Lecture Notes in Electrical Engineering, vol 808. Springer, Singapore. https://doi.org/10.1007/978-981-16-6554-7_171
DOI: https://doi.org/10.1007/978-981-16-6554-7_171
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-6553-0
Online ISBN: 978-981-16-6554-7
eBook Packages: Intelligent Technologies and Robotics (R0)