Multi-level anomaly detection: Relevance of big data analytics in networks

Sadhana

Abstract

The Internet has become a vital source of information, and both internal and external attacks threaten the integrity of any LAN connected to it. This work describes several techniques for detecting such threats, focussing on anomaly-based intrusion detection at the network edge of a campus environment. A campus LAN with more than 9000 users and a 90 Mbps Internet access link is a large network, so efficient techniques are required to handle the resulting big data and to model user behaviour. Proxy server logs of a campus LAN and edge router traces are used to detect anomalies such as abusive Internet access and systematic downloading (internal threats) and DDoS attacks (external threats). The techniques combine machine learning and time series analysis applied at different layers of the TCP/IP stack. Their accuracy is demonstrated through extensive experimentation on large and varied datasets. All the techniques are applicable at the edge and can be integrated into a Network Intrusion Detection System.
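The abstract does not spell out the algorithms used, so the following is only a generic illustration of what time series analysis at the network edge can look like, not the authors' method. It is a minimal sketch, assuming per-interval packet counts (for example SYN packets seen at the edge router) and an exponentially weighted moving average (EWMA) baseline. The function name `ewma_anomalies` and the parameters `alpha`, `k` and `warmup` are hypothetical choices made for this example.

```python
from typing import List


def ewma_anomalies(counts: List[float], alpha: float = 0.2,
                   k: float = 3.0, warmup: int = 5) -> List[int]:
    """Return indices whose count exceeds the EWMA baseline by k EWMA std-devs.

    Illustrative only: alpha (smoothing), k (threshold) and warmup
    (intervals to observe before flagging) are assumed values.
    """
    if not counts:
        return []
    mean = float(counts[0])  # running EWMA of the counts
    var = 0.0                # running EWMA of the squared deviation
    flagged = []
    for i in range(1, len(counts)):
        x = float(counts[i])
        deviation = x - mean
        std = var ** 0.5
        # Only upward spikes are flagged, since a flood raises the count.
        if i >= warmup and deviation > 0 and deviation > k * std:
            flagged.append(i)
            continue  # keep the anomalous interval out of the baseline
        mean += alpha * deviation
        var = (1 - alpha) * (var + alpha * deviation ** 2)
    return flagged


if __name__ == "__main__":
    # Synthetic counts with one injected spike at index 8.
    series = [100, 102, 98, 101, 99, 102, 100, 97, 500, 101, 99]
    print(ewma_anomalies(series))
```

Skipping the baseline update for flagged intervals is a design choice in this sketch: it stops a sustained attack from being absorbed into the estimate of normal traffic.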

Acknowledgements

We thank Mr. A Ramasamy for contributing resources from his thesis for this work. This work was carried out under the IU-ATC project funded by the Department of Science and Technology (DST), Government of India and the UK EPSRC Digital Economy Programme.

Author information

Corresponding author

Correspondence to SAAD Y SAIT.

About this article

Cite this article

SAIT, S.Y., BHANDARI, A., KHARE, S. et al. Multi-level anomaly detection: Relevance of big data analytics in networks. Sadhana 40, 1737–1767 (2015). https://doi.org/10.1007/s12046-015-0416-0

  • DOI: https://doi.org/10.1007/s12046-015-0416-0
