Abstract
Cyberspace continues to grow more complex than anticipated, and so do its security dynamics. As our dependence on cyberspace increases day by day, regular and systematic monitoring of cyberspace security has become essential. A darknet is one such monitoring framework for deducing malicious activities and attack patterns in cyberspace. Darknet traffic is the spurious traffic observed in an empty address space, i.e., a set of globally valid Internet Protocol (IP) addresses that are not assigned to any host or device. In an ideally secure network, no traffic is expected to arrive in such a darknet IP space. In reality, however, a noticeable amount of traffic is observed there, primarily due to Internet-wide malicious activities and attacks, and sometimes due to network-level misconfigurations. Analyzing this traffic and finding the distinct attack patterns in it is a potential mechanism for inferring attack trends in the real network. In this paper, the existing Basic and Extended AGgregate and Mode (AGM) data formats for darknet traffic analysis are studied, and an efficient 29-tuple Numerical AGM data format is proposed for analyzing TCP connections validated by source IP address (three-way handshake), so that attack patterns in this traffic can be found using the Mean Shift clustering algorithm. Analyzing the patterns detected in the clusters reveals traces of various attacks, such as the Mirai bot, SQL attacks, and brute-force attacks. Analyzing source-IP-validated TCP darknet traffic is therefore a potential technique in cyber security for finding attack trends in the network.
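The clustering step named in the abstract can be illustrated with a minimal sketch of flat-kernel mean shift. This is not the authors' implementation: the function name `mean_shift`, the `bandwidth` value, and the two-dimensional toy vectors are assumptions made for illustration, and they stand in for the 29-tuple Numerical AGM records, whose fields are not listed in the abstract.

```python
import math


def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift. Returns (modes, labels).

    Hypothetical helper for illustration; each point is a tuple of numbers.
    """

    def shift(p):
        # Move p to the mean of all data points within `bandwidth` of it.
        near = [q for q in points if math.dist(p, q) <= bandwidth]
        if not near:  # defensive guard; should not trigger for data points
            return p
        return tuple(sum(q[i] for q in near) / len(near) for i in range(len(p)))

    # Step 1: shift every point uphill until it stops moving.
    converged = []
    for p in points:
        cur = tuple(p)
        for _ in range(max_iter):
            nxt = shift(cur)
            if math.dist(cur, nxt) < tol:
                break
            cur = nxt
        converged.append(cur)

    # Step 2: merge converged positions that lie close together into modes.
    modes, labels = [], []
    for c in converged:
        for idx, m in enumerate(modes):
            if math.dist(c, m) < bandwidth / 2:
                labels.append(idx)
                break
        else:
            modes.append(c)
            labels.append(len(modes) - 1)
    return modes, labels


# Toy 2-D vectors (assumed values, not real darknet records).
points = [
    (1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
    (10.0, 10.0), (10.2, 9.9), (9.8, 10.1),
]
modes, labels = mean_shift(points, bandwidth=2.0)
print(len(modes), labels)
```

Mean shift does not require the number of clusters to be fixed in advance; only the bandwidth has to be chosen, which is convenient when the number of distinct attack patterns in the traffic is unknown.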
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
This article is part of the topical collection “Advances in Internet Research and Engineering” guest edited by Mohit Sethi, Debabrata Das, P. V. Ananda Mohan and Balaji Rajendran.
Cite this article
Niranjana, R., Kumar, V.A. & Sheen, S. Darknet Traffic Analysis and Classification Using Numerical AGM and Mean Shift Clustering Algorithm. SN COMPUT. SCI. 1, 16 (2020). https://doi.org/10.1007/s42979-019-0016-x
DOI: https://doi.org/10.1007/s42979-019-0016-x