Abstract
P2P and similar multimedia applications are regarded as the primary bandwidth-consuming network behaviors. Accurate identification of network traffic behavior supports numerous network activities, from network management, monitoring, and Quality of Service (QoS) to forecasting and application-specific investigations. Accuracy and performance are the two most important metrics for traffic identification, especially for online implementation. In this paper, the optimization of feature selection for traffic identification is demonstrated on two traces captured at different times and locations. Moreover, the effect of this optimization on the identification of various applications is compared and analyzed in online and offline settings using the C4.5 decision tree algorithm. Our research demonstrates that the optimal features for traffic identification are sensitive mainly to application, time, and location. Identifying the same application behavior at different network locations is sensitive to different features. Experimental results show that the selected optimal feature subset can greatly improve performance for both online and offline identification. Furthermore, it can improve the practicality of implementing online traffic identification under real network conditions.
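As a rough illustration of what ranking flow features before training a C4.5-style classifier can look like, the sketch below scores discrete features by gain ratio, the split criterion C4.5 uses. The abstract does not state which feature selection method the authors applied, so this is only an assumed filter-style example; the feature names (`pkt_size_bin`, `port_class`, `duration_bin`), the labels, and the six toy flows are hypothetical and not taken from the paper's traces.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(items).values())


def gain_ratio(values, labels):
    """Information gain of a discrete feature divided by its split information."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the labels after splitting on the feature.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


def rank_features(rows, labels, names):
    """Return (feature name, gain ratio) pairs, best feature first."""
    scores = {name: gain_ratio([row[i] for row in rows], labels)
              for i, name in enumerate(names)}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical, hand-made flow records (not data from the paper).
names = ["pkt_size_bin", "port_class", "duration_bin"]
rows = [
    ("large", "high", "short"),
    ("large", "high", "long"),
    ("small", "low", "short"),
    ("small", "low", "long"),
    ("large", "low", "short"),
    ("small", "high", "long"),
]
labels = ["p2p", "p2p", "web", "web", "p2p", "web"]

ranking = rank_features(rows, labels, names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Repeating such a ranking separately for each trace or each target application is one simple way to check whether the top-ranked features change with application, time, or location, which is the kind of sensitivity the abstract reports.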
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Chen, Z., Peng, L., Zhao, S., Zhang, L., Jing, S. (2014). Feature Selection Toward Optimizing Internet Traffic Behavior Identification. In: Sun, Xh., et al. Algorithms and Architectures for Parallel Processing. ICA3PP 2014. Lecture Notes in Computer Science, vol 8631. Springer, Cham. https://doi.org/10.1007/978-3-319-11194-0_56
DOI: https://doi.org/10.1007/978-3-319-11194-0_56
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11193-3
Online ISBN: 978-3-319-11194-0
eBook Packages: Computer Science, Computer Science (R0)