Network traffic analysis is used to detect intrusions and manage application traffic. Continuous batch network traffic analysis is computationally demanding. Because traffic intensity varies between natural peaks and troughs, a network analysis cluster may have to be severely over-dimensioned to support 24/7 continuous packet block capture and processing. In this paper, we characterize the computational requirements of processing network traffic packets under several conditions; this characterization is a useful tool for generating a network workload in simulated scenarios. Our target MapReduce jobs are map-intensive and include string matching-based virus and malware detection. We present an architecture for a Hadoop-based network analysis solution that includes a scheduler, report on using this approach in a small cluster, and show scheduling performance results obtained through simulation. The scheduler considers a cloud-based traffic analysis solution that bursts traffic to the cloud to overcome local resource limitations. The results show that we can reduce the amount of traffic to burst out by up to 50% and still accomplish continuous batch traffic analysis with run times comparable to those of a single job.
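The idea of bursting traffic to the cloud only when the local cluster cannot keep up can be illustrated with a minimal sketch. This is not the scheduler described in the paper: the greedy backlog rule, the `Block` type, and the parameters `interval` and `max_backlog` are all assumptions introduced here for illustration. Each captured packet block carries an estimated local processing time; a block is kept local if the pending work stays within a backlog limit, and is burst out otherwise.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """A captured packet block (hypothetical type, for illustration only)."""
    block_id: int
    work: float  # estimated local processing time, in seconds


def schedule(blocks: List[Block], interval: float,
             max_backlog: float) -> Tuple[List[int], List[int]]:
    """Greedy on-line policy (an assumption, not the paper's scheduler).

    One block arrives every `interval` seconds. Between arrivals the local
    cluster drains up to `interval` seconds of pending work. A block is
    processed locally if the backlog stays within `max_backlog`; otherwise
    it is burst to the cloud.
    """
    backlog = 0.0
    local: List[int] = []
    burst: List[int] = []
    for block in blocks:
        # Work completed locally since the previous block arrived.
        backlog = max(0.0, backlog - interval)
        if backlog + block.work <= max_backlog:
            backlog += block.work
            local.append(block.block_id)
        else:
            burst.append(block.block_id)
    return local, burst


if __name__ == "__main__":
    # A short traffic peak: three heavy blocks between two light ones.
    works = [30.0, 100.0, 100.0, 100.0, 30.0]
    blocks = [Block(i, w) for i, w in enumerate(works)]
    local, burst = schedule(blocks, interval=60.0, max_backlog=120.0)
    print("local:", local)
    print("burst:", burst)
```

In this toy run only the block that would push the backlog past the limit is sent out; the remaining blocks of the peak are absorbed locally by letting the backlog grow and then drain during quieter intervals.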
Work (partially) funded by the Operational Programme for Competitiveness and Internationalisation - COMPETE 2020 within project POCI-01-0145-FEDER-006961, and by FCT – Portuguese Foundation for Science and Technology as part of projects UID/EEA/50014/2013 and UID/CEC/00027/2013.
About this article
Cite this article
Morla, R., Gonçalves, P. & Barbosa, J.G. High-performance network traffic analysis for continuous batch intrusion detection. J Supercomput 72, 4107–4128 (2016). https://doi.org/10.1007/s11227-016-1743-6
- Packet network traffic analysis
- Cloud bursting
- On-line scheduling