Clustering and Principal Feature Selection Impact for Internet Traffic Classification Using K-NN

  • Conference paper
Proceedings of Second International Conference on Electrical Systems, Technology and Information 2015 (ICESTI 2015)

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 365)

Abstract

K-NN is a classification algorithm that is suitable for large amounts of data and achieves high accuracy for internet traffic classification. Its drawback is computation time, because K-NN computes the distance to every record in the dataset. This research proposes an alternative that reduces K-NN computation time by running a clustering process before the classification process, since clustering does not require high computation time. The Fuzzy C-Means algorithm is used to cluster the input dataset. Fuzzy C-Means has a drawback of its own: its results often differ between runs even when the input data are the same, and the initial dataset supplied to it is not optimal. To optimize the initial dataset, a feature selection algorithm is applied first; after the main features of the dataset are selected, the output of Fuzzy C-Means becomes consistent. Feature selection is therefore expected to provide an initial dataset that is optimal for Fuzzy C-Means. The feature selection algorithm used in this study is Principal Component Analysis (PCA). PCA removes nonsignificant attributes to create an optimal dataset and can improve the performance of the clustering and classification algorithms. The results show that clustering and principal feature selection have a significant impact on accuracy and computation time for internet traffic classification. The combination of these three methods was successfully modeled to produce a classification method for internet bandwidth usage data.
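The abstract does not say how the clusters are used to shorten the K-NN search, so the following is only a minimal sketch under one plausible reading: each training record is assigned to the Fuzzy C-Means cluster in which it has the highest membership, and a query is compared only with the records of its nearest cluster. The PCA step is omitted here (the features are assumed to be already reduced), and the function names, the toy points, and the class labels "web" and "video" are all hypothetical.

```python
import math
import random
from collections import Counter

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def memberships(points, centers, m):
    """u[i][j] = 1 / sum_k (d_ij / d_ik)^(2/(m-1)); each row sums to 1."""
    power = 2.0 / (m - 1.0)
    rows = []
    for p in points:
        d = [max(dist(p, ctr), 1e-12) for ctr in centers]  # avoid division by zero
        rows.append([1.0 / sum((dj / dk) ** power for dk in d) for dj in d])
    return rows

def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means; returns (centers, membership matrix)."""
    rng = random.Random(seed)  # fixed seed: the result depends on the initial centers
    centers = rng.sample(points, c)
    n, dim = len(points), len(points[0])
    for _ in range(iters):
        u = memberships(points, centers, m)
        new_centers = []
        for j in range(c):
            # Each center is the mean of all points weighted by u_ij^m.
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            new_centers.append(
                [sum(w[i] * points[i][t] for i in range(n)) / total for t in range(dim)]
            )
        shift = max(dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(points, centers, m)

def build_index(points, labels, c, seed=0):
    """Assumed scheme: put each record in the cluster where its membership is highest."""
    centers, u = fuzzy_c_means(points, c, seed=seed)
    buckets = [[] for _ in range(c)]
    for p, y, row in zip(points, labels, u):
        buckets[row.index(max(row))].append((p, y))
    return centers, buckets

def classify(query, centers, buckets, k=3):
    """K-NN majority vote restricted to the cluster whose center is nearest."""
    j = min(range(len(centers)), key=lambda i: dist(query, centers[i]))
    # Fall back to all records if the chosen cluster happens to be empty.
    candidates = buckets[j] or [pair for b in buckets for pair in b]
    nearest = sorted(candidates, key=lambda pair: dist(query, pair[0]))[:k]
    return Counter(y for _, y in nearest).most_common(1)[0][0]

# Toy data: two well-separated groups with made-up class labels.
points = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2],
          [5.0, 5.1], [5.2, 4.9], [4.9, 5.3], [5.3, 5.2]]
labels = ["web"] * 4 + ["video"] * 4

centers, buckets = build_index(points, labels, c=2)
print(classify([0.1, 0.1], centers, buckets))
```

With this scheme a query costs one distance per cluster center plus one per record in the chosen cluster, rather than one per record in the whole dataset. The fixed seed only hides the run-to-run variation that the abstract attributes to Fuzzy C-Means; it does not replace the PCA-based remedy the authors describe.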

Acknowledgments

We would like to thank Indonesian Higher Education and Research for this opportunity and the research grant, and University of Ciputra for the research facility.

Author information

Corresponding author

Correspondence to P. Adi Suryaputra.

Copyright information

© 2016 Springer Science+Business Media Singapore

About this paper

Cite this paper

Wiradinata, T., Adi Suryaputra, P. (2016). Clustering and Principal Feature Selection Impact for Internet Traffic Classification Using K-NN. In: Pasila, F., Tanoto, Y., Lim, R., Santoso, M., Pah, N. (eds) Proceedings of Second International Conference on Electrical Systems, Technology and Information 2015 (ICESTI 2015). Lecture Notes in Electrical Engineering, vol 365. Springer, Singapore. https://doi.org/10.1007/978-981-287-988-2_7

  • DOI: https://doi.org/10.1007/978-981-287-988-2_7

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-287-986-8

  • Online ISBN: 978-981-287-988-2

  • eBook Packages: Engineering, Engineering (R0)