Abstract
An intrusion detection system has become a vital mechanism to detect a wide variety of malicious activities in the cyber domain. However, this system still faces an important limitation when it comes to detecting zero-day attacks, concerning the reduction of relatively high false alarm rates. It is thus necessary to no longer consider the tasks of monitoring and analysing network data in isolation, but instead optimise their integration with decision-making methods for identifying anomalous events. This chapter presents a scalable framework for building an effective and lightweight anomaly detection system. This framework includes three modules of capturing and logging, pre-processing and a new statistical decision engine, called the Dirichlet mixture model based anomaly detection technique. The first module sniffs and collects network data while the second module analyses and filters these data to improve the performance of the decision engine. Finally, the decision engine is designed based on the Dirichlet mixture model with a lower-upper interquartile range as decision engine. The performance of this framework is evaluated on two well-known datasets, the NSL-KDD and UNSW-NB15. The empirical results showed that the statistical analysis of network data helps in choosing the best model which correctly fits the network data. Additionally, the Dirichlet mixture model based anomaly detection technique provides a higher detection rate and lower false alarm rate than other three compelling techniques. These techniques were built based on correlation and distance measures that cannot detect modern attacks which mimic normal activities, whereas the proposed technique was established using the Dirichlet mixture model and precise boundaries of interquartile range for finding small differences between legitimate and attack vectors, efficiently identifying these attacks.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Notes
- 1.
The NSLKDD dataset, https://web.archive.org/web/20150205070216/http://nsl.cs.unb.ca/NSL-KDD/, November 2016.
- 2.
The UNSW-NB15 dataset, https://www.unsw.adfa.edu.au/australian-centre-for-cyber-security/cybersecurity/ADFA-NB15-Datasets/, November 2016.
- 3.
The IXIA Tools, https://www.ixiacom.com/products/perfectstorm, November 2016.
- 4.
Pcap refers to packet capture, which contains an Application Programming Interface (API) for saving network data. The UNIX operating systems execute the pcap format using the libpcap library while the Windows operating systems utilise a port of libpcap, called WinPcap.
- 5.
The MySQL Cluster CGE technology, https://www.mysql.com/products/cluster/, November 2016.
- 6.
The Hadoop technologies, http://hadoop.apache.org/, November 2016.
References
Aburomman, A.A., Reaz, M.B.I.: A novel svm-knn-pso ensemble method for intrusion detection system. Applied Soft Computing 38, 360–372 (2016)
Ahmed, M., Mahmood, A.N., Hu, J.: A survey of network anomaly detection techniques. Journal of Network and Computer Applications 60, 19–31 (2016)
Alqahtani, S.M., Al Balushi, M., John, R.: An intelligent intrusion detection system for cloud computing (sidscc). In: Computational Science and Computational Intelligence (CSCI), 2014 International Conference on, vol. 2, pp. 135–141. IEEE (2014)
Ambusaidi, M., He, X., Nanda, P., Tan, Z.: Building an intrusion detection system using a filter-based feature selection algorithm (2016)
traffic analysis, N.: Network traffic analysis (November 2016). URL https://www.ipswitch.com/solutions/network-traffic-analysis
Berthier, R., Sanders, W.H., Khurana, H.: Intrusion detection for advanced metering infrastructures: Requirements and architectural directions. In: Smart Grid Communications (SmartGridComm), 2010 First IEEE International Conference on, pp. 350–355. IEEE (2010)
Bhuyan, M.H., Bhattacharyya, D.K., Kalita, J.K.: Network anomaly detection: methods, systems and tools. IEEE Communications Surveys & Tutorials 16(1), 303–336 (2014)
Bouguila, N., Ziou, D., Vaillancourt, J.: Unsupervised learning of a finite mixture model based on the dirichlet distribution and its application. IEEE Transactions on Image Processing 13(11), 1533–1543 (2004)
Boutemedjet, S., Bouguila, N., Ziou, D.: A hybrid feature extraction selection approach for high-dimensional non-gaussian data clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence 31(8), 1429–1443 (2009)
Chandola, V., Banerjee, A., Kumar, V.: Anomaly detection: A survey. ACM computing surveys (CSUR) 41(3), 15 (2009)
Corona, I., Giacinto, G., Roli, F.: Adversarial attacks against intrusion detection systems: Taxonomy, solutions and open issues. Information Sciences 239, 201–225 (2013)
Ding, Q., Kolaczyk, E.D.: A compressed pca subspace method for anomaly detection in high-dimensional data. IEEE Transactions on Information Theory 59(11), 7419–7433 (2013)
Dua, S., Du, X.: Data mining and machine learning in cybersecurity. CRC press (2016)
Dubey, S., Dubey, J.: Kbb: A hybrid method for intrusion detection. In: Computer, Communication and Control (IC4), 2015 International Conference on, pp. 1–6. IEEE (2015)
Escobar, M.D., West, M.: Bayesian density estimation and inference using mixtures. Journal of the american statistical association 90(430), 577–588 (1995)
Fahad, A., Tari, Z., Almalawi, A., Goscinski, A., Khalil, I., Mahmood, A.: Ppfscada: Privacy preserving framework for scada data publishing. Future Generation Computer Systems 37, 496–511 (2014)
Fan, W., Bouguila, N., Ziou, D.: Unsupervised anomaly intrusion detection via localized bayesian feature selection. In: 2011 IEEE 11th International Conference on Data Mining, pp. 1032–1037. IEEE (2011)
Fan, W., Bouguila, N., Ziou, D.: Variational learning for finite dirichlet mixture models and applications. IEEE transactions on neural networks and learning systems 23(5), 762–774 (2012)
Ghasemi, A., Zahediasl, S., et al.: Normality tests for statistical analysis: a guide for non-statisticians. International journal of endocrinology and metabolism 10(2), 486–489 (2012)
Giannetsos, T., Dimitriou, T.: Spy-sense: spyware tool for executing stealthy exploits against sensor networks. In: Proceedings of the 2nd ACM workshop on Hot topics on wireless network security and privacy, pp. 7–12. ACM (2013)
Greggio, N.: Learning anomalies in idss by means of multivariate finite mixture models. In: Advanced Information Networking and Applications (AINA), 2013 IEEE 27th International Conference on, pp. 251–258. IEEE (2013)
Harrou, F., Kadri, F., Chaabane, S., Tahon, C., Sun, Y.: Improved principal component analysis for anomaly detection: Application to an emergency department. Computers & Industrial Engineering 88, 63–77 (2015)
Horng, S.J., Su, M.Y., Chen, Y.H., Kao, T.W., Chen, R.J., Lai, J.L., Perkasa, C.D.: A novel intrusion detection system based on hierarchical clustering and support vector machines. Expert systems with Applications 38(1), 306–313 (2011)
Hung, S.S., Liu, D.S.M.: A user-oriented ontology-based approach for network intrusion detection. Computer Standards & Interfaces 30(1), 78–88 (2008)
Jadhav, A., Jadhav, A., Jadhav, P., Kulkarni, P.: A novel approach for the design of network intrusion detection system (nids). In: Sensor Network Security Technology and Privacy Communication System (SNS & PCS), 2013 International Conference on, pp. 22–27. IEEE (2013)
Lee, Y.J., Yeh, Y.R., Wang, Y.C.F.: Anomaly detection via online oversampling principal component analysis. IEEE Transactions on Knowledge and Data Engineering 25(7), 1460–1470 (2013)
Li, W., Mahadevan, V., Vasconcelos, N.: Anomaly detection and localization in crowded scenes. IEEE transactions on pattern analysis and machine intelligence 36(1), 18–32 (2014)
Milenkoski, A., Vieira, M., Kounev, S., Avritzer, A., Payne, B.D.: Evaluating computer intrusion detection systems: A survey of common practices. ACM Computing Surveys (CSUR) 48(1), 12 (2015)
Minka, T.: Estimating a dirichlet distribution (2000)
Modi, C., Patel, D., Borisaniya, B., Patel, H., Patel, A., Rajarajan, M.: A survey of intrusion detection techniques in cloud. Journal of Network and Computer Applications 36(1), 42–57 (2013)
Moustafa, N., Slay, J.: A hybrid feature selection for network intrusion detection systems: Central points. In: the Proceedings of the 16th Australian Information Warfare Conference, Edith Cowan University, Joondalup Campus, Perth, Western Australia, pp. 5–13. Security Research Institute, Edith Cowan University (2015)
Moustafa, N., Slay, J.: The significant features of the unsw-nb15 and the kdd99 data sets for network intrusion detection systems. In: Building Analysis Datasets and Gathering Experience Returns for Security (BADGERS), 2015 4th International Workshop on, pp. 25–31. IEEE (2015)
Moustafa, N., Slay, J.: Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In: Military Communications and Information Systems Conference (MilCIS), 2015, pp. 1–6. IEEE (2015)
Moustafa, N., Slay, J.: The evaluation of network anomaly detection systems: Statistical analysis of the unsw-nb15 data set and the comparison with the kdd99 data set. Information Security Journal: A Global Perspective (2016)
Nadiammai, G., Hemalatha, M.: An evaluation of clustering technique over intrusion detection system. In: Proceedings of the International Conference on Advances in Computing, Communications and Informatics, pp. 1054–1060. ACM (2012)
Naldurg, P., Sen, K., Thati, P.: A temporal logic based framework for intrusion detection. In: International Conference on Formal Techniques for Networked and Distributed Systems, pp. 359–376. Springer (2004)
Perdisci, R., Gu, G., Lee, W.: Using an ensemble of one-class svm classifiers to harden payload-based anomaly detection systems. In: Sixth International Conference on Data Mining (ICDM’06), pp. 488–498. IEEE (2006)
Pontarelli, S., Bianchi, G., Teofili, S.: Traffic-aware design of a high-speed fpga network intrusion detection system. IEEE Transactions on Computers 62(11), 2322–2334 (2013)
Ranshous, S., Shen, S., Koutra, D., Harenberg, S., Faloutsos, C., Samatova, N.F.: Anomaly detection in dynamic networks: a survey. Wiley Interdisciplinary Reviews: Computational Statistics 7(3), 223–247 (2015)
Rousseeuw, P.J., Hubert, M.: Robust statistics for outlier detection. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 1(1), 73–79 (2011)
Saligrama, V., Chen, Z.: Video anomaly detection based on local statistical aggregates. In: Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on, pp. 2112–2119. IEEE (2012)
Seeberg, V.E., Petrovic, S.: A new classification scheme for anonymization of real data used in ids benchmarking. In: Availability, Reliability and Security, 2007. ARES 2007. The Second International Conference on, pp. 385–390. IEEE (2007)
Shameli-Sendi, A., Cheriet, M., Hamou-Lhadj, A.: Taxonomy of intrusion risk assessment and response system. Computers & Security 45, 1–16 (2014)
Sheikhan, M., Jadidi, Z.: Flow-based anomaly detection in high-speed links using modified gsa-optimized neural network. Neural Computing and Applications 24(3–4), 599–611 (2014)
Shifflet, J.: A technique independent fusion model for network intrusion detection. In: Proceedings of the Midstates Conference on Undergraduate Research in Computer Science and Mat hematics, vol. 3, pp. 1–3. Citeseer (2005)
Tan, Z., Jamdagni, A., He, X., Nanda, P., Liu, R.P.: Denial-of-service attack detection based on multivariate correlation analysis. In: International Conference on Neural Information Processing, pp. 756–765. Springer (2011)
Tan, Z., Jamdagni, A., He, X., Nanda, P., Liu, R.P.: A system for denial-of-service attack detection based on multivariate correlation analysis. IEEE transactions on parallel and distributed systems 25(2), 447–456 (2014)
Tavallaee, M., Bagheri, E., Lu, W., Ghorbani, A.A.: A detailed analysis of the kdd cup 99 data set. In: Proceedings of the Second IEEE Symposium on Computational Intelligence for Security and Defence Applications 2009 (2009)
Tsai, C.F., Lin, C.Y.: A triangle area based nearest neighbors approach to intrusion detection. Pattern recognition 43(1), 222–229 (2010)
Wagle, B.: Multivariate beta distribution and a test for multivariate normality. Journal of the Royal Statistical Society. Series B (Methodological) pp. 511–516 (1968)
Wu, S.X., Banzhaf, W.: The use of computational intelligence in intrusion detection systems: A review. Applied Soft Computing 10(1), 1–35 (2010)
Zainaddin, D.A.A., Hanapi, Z.M.: Hybrid of fuzzy clustering neural network over nsl dataset for intrusion detection system. Journal of Computer Science 9(3), 391 (2013)
Zuech, R., Khoshgoftaar, T.M., Wald, R.: Intrusion detection and big heterogeneous data: a survey. Journal of Big Data 2(1), 1 (2015)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Moustafa, N., Creech, G., Slay, J. (2017). Big Data Analytics for Intrusion Detection System: Statistical Decision-Making Using Finite Dirichlet Mixture Models. In: Palomares Carrascosa, I., Kalutarage, H., Huang, Y. (eds) Data Analytics and Decision Support for Cybersecurity. Data Analytics. Springer, Cham. https://doi.org/10.1007/978-3-319-59439-2_5
Download citation
DOI: https://doi.org/10.1007/978-3-319-59439-2_5
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59438-5
Online ISBN: 978-3-319-59439-2
eBook Packages: Computer ScienceComputer Science (R0)