Abstract
The Internet has become a vital source of information, but internal and external attacks threaten the integrity of any LAN connected to it. This work describes several techniques for detecting such threats, focusing on anomaly-based intrusion detection at the network edge of a campus environment. The campus LAN studied is large, with more than 9000 users sharing a 90 Mbps Internet access link, so efficient techniques are required to handle such big data and to model user behaviour. Proxy server logs of the campus LAN and edge router traces are used to detect anomalies such as abusive Internet access and systematic downloading (internal threats) and DDoS attacks (an external threat). The techniques involve machine learning and time series analysis applied at different layers of the TCP/IP stack. Their accuracy is demonstrated through extensive experimentation on large and varied datasets. All the techniques are applicable at the edge and can be integrated into a Network Intrusion Detection System.
Acknowledgements
We thank Mr. A Ramasamy for contributing resources from his thesis to this work. This work was carried out under the IU-ATC project, funded by the Department of Science and Technology (DST), Government of India, and the UK EPSRC Digital Economy Programme.
Cite this article
SAIT, S.Y., BHANDARI, A., KHARE, S. et al. Multi-level anomaly detection: Relevance of big data analytics in networks. Sadhana 40, 1737–1767 (2015). https://doi.org/10.1007/s12046-015-0416-0
DOI: https://doi.org/10.1007/s12046-015-0416-0