Abstract
In certain cyber-attack scenarios, such as flooding denial-of-service attacks, the data distribution changes significantly. The result is a collective anomaly, in which similar kinds of otherwise normal data instances appear in abnormally large numbers. Because these instances are not rare, existing anomaly detection techniques, which target rare anomalies, cannot identify them properly. This paper investigates detecting this behaviour with existing clustering and co-clustering based techniques, and uses network traffic modelling via the Hurst parameter to propose a more effective algorithm that combines clustering with the Hurst parameter. Experimental analysis shows that the proposed Hurst parameter-based technique outperforms existing collective and rare anomaly detection techniques in terms of detection accuracy and false positive rate. The experimental results are based on benchmark datasets such as KDD Cup 1999 and UNSW-NB15.
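To make the Hurst parameter concrete, the sketch below estimates it for a numeric series with classical rescaled-range (R/S) analysis. The abstract does not say which estimator the paper uses, so the choice of R/S, the function names `rescaled_range` and `hurst_rs`, and the `min_window` parameter are assumptions made for illustration only. This is not the paper's combined clustering and Hurst algorithm.

```python
import math
import random


def rescaled_range(chunk):
    """Return R/S for one window, or None if the window has zero variance."""
    n = len(chunk)
    mean = sum(chunk) / n
    # Cumulative sum of deviations from the window mean.
    cumulative, running = [], 0.0
    for value in chunk:
        running += value - mean
        cumulative.append(running)
    spread = max(cumulative) - min(cumulative)
    std = math.sqrt(sum((v - mean) ** 2 for v in chunk) / n)
    return spread / std if std > 0 else None


def hurst_rs(series, min_window=8):
    """Estimate H as the slope of log(mean R/S) against log(window size)."""
    n = len(series)
    log_sizes, log_rs = [], []
    window = min_window
    while window <= n // 2:
        # Non-overlapping windows of the current size.
        values = []
        for start in range(0, n - window + 1, window):
            rs = rescaled_range(series[start:start + window])
            if rs is not None:
                values.append(rs)
        if values:
            log_sizes.append(math.log(window))
            log_rs.append(math.log(sum(values) / len(values)))
        window *= 2  # hypothetical choice: double the window each step
    if len(log_sizes) < 2:
        raise ValueError("series too short for an R/S estimate")
    # Ordinary least-squares slope.
    mx = sum(log_sizes) / len(log_sizes)
    my = sum(log_rs) / len(log_rs)
    num = sum((x - mx) * (y - my) for x, y in zip(log_sizes, log_rs))
    den = sum((x - mx) ** 2 for x in log_sizes)
    return num / den


if __name__ == "__main__":
    rng = random.Random(0)
    noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
    print("H estimate for uncorrelated noise:", hurst_rs(noise))
```

Under these assumptions, `hurst_rs` could be applied to a series such as per-interval packet counts, with a marked shift in the estimate between time windows treated as a hint that the traffic pattern has changed. The R/S estimator is known to be biased for short windows, so the values are better read as relative indicators than as exact measurements.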
Cite this article
Ahmed, M. Collective Anomaly Detection Techniques for Network Traffic Analysis. Ann. Data. Sci. 5, 497–512 (2018). https://doi.org/10.1007/s40745-018-0149-0