An SVM-based machine learning method for accurate internet traffic classification
Accurate and timely traffic classification is critical in network security monitoring and traffic engineering. Traditional methods based on port numbers and protocols have proven ineffective in the presence of dynamic port allocation and packet encapsulation. Signature matching methods, on the other hand, require a known signature set and processing of the packet payload, and can handle the signatures of only a limited number of IP packets in real time. A machine learning method based on the support vector machine (SVM) is proposed in this paper for accurate Internet traffic classification. The method classifies Internet traffic into broad application categories according to network flow parameters obtained from the packet headers. An optimized feature set is obtained via multiple classifier selection methods. Experimental results using traffic from a campus backbone show that an accuracy of 99.42% is achieved with the regular biased training and testing samples. An accuracy of 97.17% is achieved when unbiased training and testing samples are used with the same feature set. Furthermore, as all the feature parameters are computable from the packet headers, the proposed method is also applicable to encrypted network traffic.
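To make the idea of flow parameters computed from packet headers concrete, the following is a minimal sketch and not the feature set used in the paper. The `Packet` record, the direction-independent 5-tuple key, and the five statistics shown (packet count, byte count, mean packet size, duration, mean inter-arrival time) are all illustrative assumptions; the abstract does not list the actual features. No payload bytes are read, which is why features of this kind remain available for encrypted traffic.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Packet:
    """Header-only view of a packet (hypothetical record layout)."""
    timestamp: float  # capture time in seconds
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str     # e.g. "TCP" or "UDP"
    length: int       # total packet length in bytes


def flow_key(pkt):
    """Build a direction-independent 5-tuple so both directions share one flow."""
    a = (pkt.src_ip, pkt.src_port)
    b = (pkt.dst_ip, pkt.dst_port)
    return (pkt.protocol,) + (a + b if a <= b else b + a)


def flow_features(packets):
    """Group packets into flows and compute simple per-flow statistics.

    The chosen statistics are assumptions for illustration only.
    """
    flows = defaultdict(list)
    for pkt in packets:
        flows[flow_key(pkt)].append(pkt)

    features = {}
    for key, pkts in flows.items():
        pkts.sort(key=lambda p: p.timestamp)
        sizes = [p.length for p in pkts]
        gaps = [later.timestamp - earlier.timestamp
                for earlier, later in zip(pkts, pkts[1:])]
        features[key] = {
            "packet_count": len(pkts),
            "total_bytes": sum(sizes),
            "mean_packet_size": mean(sizes),
            "duration": pkts[-1].timestamp - pkts[0].timestamp,
            "mean_interarrival": mean(gaps) if gaps else 0.0,
        }
    return features
```

Each resulting feature dictionary could then be turned into a numeric vector and passed to an SVM implementation of the reader's choice for training and classification.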
Information Systems Frontiers
Volume 12, Issue 2, pp. 149-156
- Springer US
- Internet traffic
- Network traffic classification
- Machine learning
- Feature selection
- Author Affiliations
- 1. Center for Intelligent and Networked Systems, TNLIST Lab, Tsinghua University, Beijing, 100084, China
- 2. MOE KLINNS Lab and SKLMS Lab, Xi’an Jiaotong University, Xi’an, 710049, China
- 3. College of Economics and Management, Beijing Jiaotong University, Beijing, 100044, China
- 4. Department of Information Technology and Decision Science, Old Dominion University, Norfolk, VA, 23529, USA