Abstract
With the growing use of the Internet and networks, data-stream classification has been applied to intrusion detection. Because data streams are unbounded and difficult to store, conventional classification algorithms (e.g., C4.5, a widely used algorithm with high classification accuracy) tend to produce incorrect classifications and memory leaks. In this paper, we propose an improved Hoeffding tree data-stream classification algorithm, Hoeffding-ID, and apply it to network data-stream processing in intrusion detection. Experimental results show that the Hoeffding-ID algorithm has relatively high detection accuracy, a low false-positive rate, and memory usage that does not grow with the number of data samples.
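For readers unfamiliar with Hoeffding trees, the sketch below illustrates the standard Hoeffding bound that such trees generally use to decide when enough stream samples have been seen to split a node. It is not the Hoeffding-ID modification proposed in the paper, whose details the abstract does not give. The function names (`hoeffding_bound`, `should_split`) and the parameter values in the usage note are assumptions chosen for illustration.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Standard Hoeffding bound.

    With probability 1 - delta, the true mean of a random variable whose
    values span `value_range` lies within the returned epsilon of the mean
    observed over n independent samples.
    """
    if n <= 0:
        raise ValueError("n must be a positive sample count")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Hypothetical helper: split once the gap between the two best
    attribute scores exceeds the Hoeffding bound for n samples."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)
```

For example, with an assumed `value_range` of 1.0 and `delta` of 1e-7, the bound after 1000 samples is roughly 0.09, so a gain gap of 0.15 would trigger a split while a gap of 0.05 would not. Because the decision depends only on per-node summary statistics and the sample count, the samples themselves need not be stored.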
Acknowledgments
This work was funded by the National Natural Science Foundation of China (No. 61373134). It was also supported by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD), the Jiangsu Key Laboratory of Meteorological Observation and Information Processing (No. KDXS1105), and the Jiangsu Collaborative Innovation Center on Atmospheric Environment and Equipment Technology (CICAEET).
About this article
Cite this article
Yin, C., Feng, L. & Ma, L. An improved Hoeffding-ID data-stream classification algorithm. J Supercomput 72, 2670–2681 (2016). https://doi.org/10.1007/s11227-015-1573-y
DOI: https://doi.org/10.1007/s11227-015-1573-y