Clustering and Principal Feature Selection Impact for Internet Traffic Classification Using K-NN

Wiradinata, Trianggoro; Adi Suryaputra, P.

doi:10.1007/978-981-287-988-2_7


Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 365)


Abstract

K-NN is a classification algorithm that is suitable for large amounts of data and achieves high accuracy for internet traffic classification. Its disadvantage is computation time, because K-NN calculates the distance to every data point in the dataset. This research provides an alternative solution to reduce K-NN computation time: a clustering process is run before the classification process. The clustering process does not require high computation time. The Fuzzy C-Means algorithm is implemented in this research to cluster the input dataset. Fuzzy C-Means has a disadvantage as a clustering method: its results are often not the same even though the input data are the same, and the initial dataset given to Fuzzy C-Means is not optimal. To optimize the initial dataset, a feature selection algorithm is used in this research; after the main features of the dataset are selected, the output of Fuzzy C-Means becomes consistent. Feature selection is expected to provide an initial dataset that is optimal for Fuzzy C-Means. The feature selection algorithm used in this study is Principal Component Analysis (PCA). PCA removes nonsignificant attributes to create an optimal dataset and can improve the performance of the clustering and classification algorithms. The result of this research is that clustering and principal feature selection have a significant impact on accuracy and computation time for internet traffic classification. The combination of these three methods has been successfully modeled to produce a classification method for internet bandwidth usage data.
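As a rough illustration of the three-stage pipeline described in the abstract (PCA, then Fuzzy C-Means, then K-NN), the sketch below uses NumPy only. It is not the authors' implementation. The synthetic data, the function names, the fuzzifier m = 2, the fixed random seed, and in particular the rule of searching for neighbours only inside the cluster whose center is nearest to the query are all assumptions made for this example; the abstract does not specify how the clusters are used during classification.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the feature mean and the top principal axes of X (rows are samples)."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_transform(X, mean, components):
    """Project X onto the retained principal axes."""
    return (X - mean) @ components.T

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Basic Fuzzy C-Means; returns (centers, membership matrix U)."""
    rng = np.random.default_rng(seed)  # fixed seed: repeated runs give the same result
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)  # each row of memberships sums to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

def knn_in_cluster(x, X_train, y_train, hard_labels, centers, k=3):
    """Majority vote among the k nearest neighbours of x, searching only the
    training points assigned to the cluster whose center is closest to x
    (an assumed way of using the clusters to cut the distance computations)."""
    nearest = np.argmin(np.linalg.norm(centers - x, axis=1))
    idx = np.flatnonzero(hard_labels == nearest)
    if idx.size == 0:
        idx = np.arange(X_train.shape[0])  # fall back to the full training set
    d = np.linalg.norm(X_train[idx] - x, axis=1)
    top = idx[np.argsort(d)[:k]]
    values, counts = np.unique(y_train[top], return_counts=True)
    return values[np.argmax(counts)]

# Synthetic stand-in for traffic features: two well-separated classes, 5 attributes.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.5, size=(60, 5)),
               rng.normal(5.0, 0.5, size=(60, 5))])
y = np.array([0] * 60 + [1] * 60)

mean, comps = pca_fit(X, n_components=2)      # step 1: reduce attributes
Z = pca_transform(X, mean, comps)
centers, U = fuzzy_c_means(Z, n_clusters=2)   # step 2: cluster the reduced data
hard = U.argmax(axis=1)                       # hard assignment from memberships

query = pca_transform(np.full((1, 5), 5.0), mean, comps)[0]
predicted = knn_in_cluster(query, Z, y, hard, centers, k=3)  # step 3: K-NN
print(predicted)
```

In this sketch the saving comes from computing distances to the cluster centers plus the members of one cluster, instead of to every training point. Whether that matches the speed-up reported in the paper cannot be determined from the abstract alone.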




Acknowledgments

We would like to thank Indonesian Higher Education and Research for this opportunity and research grant, and the University of Ciputra for the research facilities.

Author information

Authors and Affiliations

University of Ciputra, UC Town Citraland, Surabaya, Indonesia
Trianggoro Wiradinata & P. Adi Suryaputra


Corresponding author

Correspondence to P. Adi Suryaputra.

Editor information

Editors and Affiliations

Electrical Engineering Department, Petra Christian University, Surabaya, Indonesia
Felix Pasila
Petra Christian University, Surabaya, Indonesia
Yusak Tanoto
Electrical Engineering Department, Petra Christian University, Surabaya, Indonesia
Resmana Lim
Electrical Engineering Department, Petra Christian University, Surabaya, Indonesia
Murtiyanto Santoso
University of Surabaya, Surabaya, Indonesia
Nemuel Daniel Pah

About this paper

Cite this paper

Wiradinata, T., Adi Suryaputra, P. (2016). Clustering and Principal Feature Selection Impact for Internet Traffic Classification Using K-NN. In: Pasila, F., Tanoto, Y., Lim, R., Santoso, M., Pah, N. (eds) Proceedings of Second International Conference on Electrical Systems, Technology and Information 2015 (ICESTI 2015). Lecture Notes in Electrical Engineering, vol 365. Springer, Singapore. https://doi.org/10.1007/978-981-287-988-2_7


DOI: https://doi.org/10.1007/978-981-287-988-2_7
Published: 30 January 2016
Publisher Name: Springer, Singapore
Print ISBN: 978-981-287-986-8
Online ISBN: 978-981-287-988-2
eBook Packages: Engineering, Engineering (R0)
