
An Empirical Investigation of Filter Attribute Selection Techniques for High-Speed Network Traffic Flow Classification

Published in: Wireless Personal Communications

Abstract

Attribute selection is an important methodology for data mining problems. Removing irrelevant and redundant attributes from the original data set can greatly simplify the construction of classifier models. In this paper, we apply attribute selection techniques to network traffic flow classification and conduct experiments on real network traffic data collected from the Internet in China. The results show that applying an appropriate attribute selection method can simplify the network traffic classifier while achieving satisfactory classification accuracy.
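As a loose illustration of the filter-based workflow the abstract describes (not the paper's actual pipeline or data), the sketch below scores each attribute independently of the classifier, keeps the top-ranked attributes, and trains a decision tree on the reduced set. The synthetic data, attribute counts, and scikit-learn components are assumptions made purely for demonstration.

```python
# Minimal sketch of filter attribute selection for flow classification.
# All data and parameter choices here are hypothetical stand-ins.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Hypothetical stand-in for per-flow attributes (packet sizes, inter-arrival
# times, flow durations, etc.) labelled with application classes.
X, y = make_classification(n_samples=2000, n_features=40, n_informative=8,
                           n_redundant=20, n_classes=4, random_state=0)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Filter attribute selection: rank attributes by a score computed
# independently of any classifier, then keep the top k.
selector = SelectKBest(score_func=mutual_info_classif, k=10)
X_train_sel = selector.fit_transform(X_train, y_train)
X_test_sel = selector.transform(X_test)

# Train a decision-tree classifier on the reduced attribute set and
# evaluate classification accuracy on held-out flows.
clf = DecisionTreeClassifier(random_state=0).fit(X_train_sel, y_train)
print("Accuracy with 10 selected attributes:",
      accuracy_score(y_test, clf.predict(X_test_sel)))
```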



Author information

Correspondence to Jie Yang.


About this article

Cite this article

Yang, J., Ma, J., Cheng, G. et al. An Empirical Investigation of Filter Attribute Selection Techniques for High-Speed Network Traffic Flow Classification. Wireless Pers Commun 66, 541–558 (2012). https://doi.org/10.1007/s11277-012-0735-y

