A Four-Stage Hybrid Feature Subset Selection Approach for Network Traffic Classification Based on Full Coverage

  • Conference paper
Security, Privacy, and Anonymity in Computation, Communication, and Storage (SpaCCS 2018)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 11342))

Abstract

Classifying traffic flows is of significant interest in network management and security. Feature subset selection is an essential step in machine-learning-based traffic classification, where it is used to reduce dimensionality and remove redundant information. This paper proposes a four-stage hybrid feature subset selection method that aims to improve the classification performance of hybrid methods while keeping the number of evaluations low. The proposed algorithm handles features at the block level and evaluates every feature, including the remaining ones that provide little information on their own, so that the interactions among all features can be exploited. In addition, a wrapper-based selection in the last stage further removes redundant features. Performance is examined in two groups of experiments. Theoretical analysis and experimental observations indicate that the proposed method selects feature subsets with improved classification performance on every index while consuming fewer evaluations. Moreover, the evaluation consumption stays low and stable across different block sizes.
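The abstract does not spell out the four stages, so the following is only a minimal sketch of the general idea it describes: features are ranked by a filter score, handled in blocks, every block is evaluated (including low-ranked ones), and a final wrapper pass removes redundant features. It is not the authors' algorithm. The function name `blockwise_hybrid_selection`, the acceptance rules, and the toy `toy_evaluate` scoring function are all assumptions made for illustration.

```python
from typing import Callable, Dict, FrozenSet, List, Sequence


def blockwise_hybrid_selection(
    filter_scores: Sequence[float],
    evaluate: Callable[[FrozenSet[int]], float],
    block_size: int,
) -> Dict[str, object]:
    """Illustrative block-wise filter + wrapper selection (hypothetical, not the paper's method)."""
    if block_size < 1:
        raise ValueError("block_size must be positive")

    cache: Dict[FrozenSet[int], float] = {}

    def score(subset: FrozenSet[int]) -> float:
        # Each distinct subset is evaluated once; len(cache) is the evaluation count.
        if subset not in cache:
            cache[subset] = evaluate(subset)
        return cache[subset]

    # Filter step: rank feature indices from highest to lowest filter score.
    ranked: List[int] = sorted(
        range(len(filter_scores)), key=lambda i: filter_scores[i], reverse=True
    )
    # Split the ranking into consecutive blocks of at most block_size features.
    blocks = [ranked[i:i + block_size] for i in range(0, len(ranked), block_size)]

    selected: FrozenSet[int] = frozenset()
    best = score(selected)

    # Block-level wrapper step: every block is tried, so low-ranked features
    # still get a chance to help through interactions with selected ones.
    for block in blocks:
        candidate = selected | frozenset(block)
        if score(candidate) > best:
            selected, best = candidate, score(candidate)

    # Final wrapper step: drop a feature when removing it does not hurt the score.
    for feature in reversed(ranked):
        if feature not in selected:
            continue
        trial = selected - {feature}
        if score(trial) >= best:
            selected, best = trial, score(trial)

    return {"selected": selected, "score": best, "evaluations": len(cache)}


def toy_evaluate(subset: FrozenSet[int]) -> float:
    # Toy stand-in for classifier accuracy: feature 0 is useful alone,
    # feature 3 only helps together with feature 0, and size is penalised.
    value = 0.0
    if 0 in subset:
        value += 0.5
        if 3 in subset:
            value += 0.4
    return value - 0.01 * len(subset)


# Feature 3 has a low filter score but is still reached in the last block.
result = blockwise_hybrid_selection([0.9, 0.8, 0.7, 0.1, 0.6, 0.2], toy_evaluate, block_size=2)
print(sorted(result["selected"]), result["evaluations"])
```

In this toy run, feature 3 scores poorly on its own but is kept because it interacts with feature 0, while the features that were accepted only as block companions are removed in the final pass. In practice `evaluate` would be something like the cross-validated accuracy of a classifier trained on the chosen columns, which is why the number of evaluations matters.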

Acknowledgements

The authors gratefully acknowledge the financial support from the Natural Science Foundation of Zhangzhou, Fujian (Project No. ZZ2018J22).

Author information

Corresponding author

Correspondence to Jingbo Xia.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Xia, J., Shen, J., Wu, Y. (2018). A Four-Stage Hybrid Feature Subset Selection Approach for Network Traffic Classification Based on Full Coverage. In: Wang, G., Chen, J., Yang, L. (eds) Security, Privacy, and Anonymity in Computation, Communication, and Storage. SpaCCS 2018. Lecture Notes in Computer Science, vol 11342. Springer, Cham. https://doi.org/10.1007/978-3-030-05345-1_15

  • DOI: https://doi.org/10.1007/978-3-030-05345-1_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-05344-4

  • Online ISBN: 978-3-030-05345-1

  • eBook Packages: Computer Science, Computer Science (R0)
