Abstract
Cloud computing is an Internet-based computing environment in which storage and computing resources are assigned dynamically to users according to their needs by means of virtualization technology. Virtualization is the underlying infrastructure of cloud computing, and it has introduced certain security problems as cloud computing has developed. One essential but formidable task in cloud computing is to detect malicious attacks and identify their types. Given the increasing number of cyber-attacks, the design and implementation of effective intrusion detection systems for protecting information systems is crucial. In this paper, a host-based intrusion detection system (H-IDS) for protecting virtual machines in the cloud environment is proposed. To this end, the important features of each class are first selected using logistic regression, and these values are then improved using the regularization technique. Next, the various attacks are classified using a combination of three different classifiers (neural network, decision tree and linear discriminant analysis), with the bagging algorithm applied for each class. The proposed model has been trained and tested using the NSL-KDD data set, with an implementation in the Cloudsim software. Simulation results, compared with other methods, show an acceptable accuracy of about 97.51% for distinguishing attacks from normal states.
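The two stages summarized above (regularized logistic regression for feature selection, then a vote among several classifiers) can be illustrated with a minimal sketch. This is not the authors' implementation: the L2 penalty, the learning rate, the toy data and the helper names `fit_logistic_l2`, `rank_features` and `majority_vote` are all assumptions chosen for illustration, and the three base classifiers are represented only by their predicted labels.

```python
import math
import random
from collections import Counter

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic_l2(X, y, lam=0.1, lr=0.5, epochs=300):
    """Batch gradient descent on mean log-loss + (lam / 2) * ||w||^2 (assumed penalty)."""
    m, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [lam * wj for wj in w]  # gradient of the L2 penalty
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                grad[j] += err * xi[j] / m
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def rank_features(w):
    """Feature indices ordered by decreasing |weight|."""
    return sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)

def majority_vote(labels):
    """Combine the labels predicted by several classifiers for one sample."""
    return Counter(labels).most_common(1)[0][0]

# Hypothetical toy data: feature 0 determines the label, feature 1 is noise.
rng = random.Random(0)
X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
y = [1 if x[0] > 0 else 0 for x in X]

weights = fit_logistic_l2(X, y)
ranking = rank_features(weights)                   # informative feature comes first
vote = majority_vote(["dos", "normal", "dos"])     # e.g. labels from three classifiers
```

On real data such as NSL-KDD, the features would normally be scaled before their weights are compared, and the number of features kept per class would be a tuning choice.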
Notes
Intelligent agent based enhanced multiclass support vector machine (IAEMSVM).
Pruning algorithm rule-based classification tree (PART).
References
Alpaydin E (2004) Introduction to machine learning. MIT Press, Cambridge
Alqahtani SM, Balushi MA, John R (2014) An intelligent intrusion detection system for cloud computing (SIDSCC). In: International conference on computational science and computational intelligence, Las Vegas, March 10–13. https://doi.org/10.1109/CSCI.2014.108
Amor NB, Benferhat S, Elouedi Z (2004) Naive Bayes vs decision trees in intrusion detection systems. In: Proceedings of the 2004 ACM symposium on applied computing, Nicosia, pp 420–424. https://doi.org/10.1145/967900.967989
Aygun RC, Yavuz AG (2017) Network anomaly detection with stochastically improved autoencoder based models. In: IEEE 4th international conference on cyber security and cloud computing (CSCloud), New York, pp 193–198. https://doi.org/10.1109/CSCloud.2017.39
Benzidane K, Khoudali S, Sekkaki A (2013) Secured architecture for inter-VM traffic in a Cloud environment. In: 2nd IEEE Latin American conference on cloud computing and communications, Maceio, Dec 9–10, pp 23–28. https://doi.org/10.1109/LatinCloud.2013.6842218
Bhat A, Patra S, Jena D (2013) Machine learning approach for intrusion detection on cloud virtual machines. Int J Appl Innov Eng Manag (IJAIEM) 2(6):57–66
Bi M, Xu J, Wang M, Zhou F (2016) Anomaly detection model of user behavior based on principle component analysis. J Ambient Intell Humaniz Comput 7(4):547–554. https://doi.org/10.1007/s12652-015-0341-4
Bühlmann P, Yu B (2002) Analyzing bagging. Ann Stat 30(4):927–961
Cloudsim simulator (2015) http://www.cloudbus.org/cloudsim
Deshpande P, Sharma SC, Peddoju SK, Junaid S (2018) HIDS: a host based intrusion detection system for cloud computing environment. Int J Syst Assur Eng Manag 9(3):567–576. https://doi.org/10.1007/s13198-014-0277-7
Dhanabal L, Shantharajah DS (2015) A study on NSL-KDD dataset for intrusion detection system based on classification algorithms. Int J Adv Res Comput Commun Eng 4(6):446–452
El-Koka A, Cha KH, Kang DK (2013) Regularization parameter tuning optimization approach in logistic regression. In: 15th international conference on advanced communications technology (ICACT), 27–30 Jan, Pyeong Chang, pp 13–18
Garfinkel T, Rosenblum M (2005) When virtual is harder than real: security challenges in virtual machine based computing environments. In: 10th workshop on hot topics in operating systems (HOTOS’05), Santa Fe, June 12–15, pp 20–25
Ghosh P, Mandal AK, Kumar R (2015) An efficient cloud network intrusion detection system. Inf Syst Des Intell Appl 1:91–99. https://doi.org/10.1007/978-81-322-2250-7_10
Gorelik E (2013) Cloud computing models. M.Sc. thesis, Massachusetts Institute of Technology
Jin H, Xiang G, Zou D, Wu S, Zhoa F, Li M (2013) A VMM-based intrusion prevention system in cloud computing environment. J Supercomput 66(3):1133–1151. https://doi.org/10.1007/s11227-011-0608-2
Kannan A, Maguire GQ, Sharma A, Schoo P (2012) Genetic algorithm based feature selection algorithm for effective intrusion detection in cloud networks. In: IEEE 12th international conference on data mining workshops, Brussels, 10 Dec. https://doi.org/10.1109/ICDMW.2012.56
Khorshed MT, Ali AS, Wasimi SA (2011) Monitoring insiders activities in cloud computing using rule based learning. In: IEEE 10th international conference on trust, security and privacy in computing and communications, Changsha, Nov 16–18. https://doi.org/10.1109/TrustCom.2011.99
Langin C, Rahimi S (2010) Soft computing in intrusion detection: the state of the art. J Ambient Intell Humaniz Comput 1(2):134–145. https://doi.org/10.1007/s12652-010-0012-4
Li Z, Sun W, Wang L (2012) A neural network based distributed intrusion detection system on cloud platform. In: IEEE 2nd international conference on cloud computing and intelligence systems, Hangzhou, 30 Oct–1 Nov. https://doi.org/10.1109/CCIS.2012.6664371
Loog M (1999) Approximate pairwise accuracy criteria for multiclass linear dimension reduction: generalisations of the Fisher criterion. Delft University Press, The Netherlands
Loukas G, Vuong T, Heartfield R, Sakellari G, Yoon Y, Gan D (2018) Cloud-based cyber-physical intrusion detection for vehicles using deep learning. IEEE Access 6:3491–3508. https://doi.org/10.1109/ACCESS.2017.2782159
Mahmood Z, Agrawal C, Hasan SS, Zenab S (2012) Intrusion detection in cloud computing environment using neural network. Int J Res Comput Eng Electron 1(1):19–22
Modi CN, Patel DR, Patel A, Rajarajan M (2012) Integrating signature Apriori based network intrusion detection system (NIDS) in cloud computing. Proc Technol 6:905–912. https://doi.org/10.1016/j.protcy.2012.10.110
Muche EW (2016) Hybrid intrusion detection system for private cloud: an integrated approach. M.Sc. thesis, Bahir Dar University
Murphy KP (2012) Machine learning, a probabilistic perspective. MIT Press, Cambridge
Muthurajkumar S, Ganapathy S, Vijayalakshmi M, Kannan A (2015) An effective intrusion detection on cloud virtual machines using hybrid feature selection and multiclass classifier. Aust J Basic Appl Sci 9(6):38–41
Nagarajan P, Perumal G (2015) A neuro fuzzy based intrusion detection system for a cloud data center using adaptive learning. Cybern Inf Technol 15(3):88–103. https://doi.org/10.1515/cait-2015-0043
Nguyen KK, Hoang DT, Niyato D, Wang P, Nguyen D, Dutkiewicz E (2018) Cyberattack detection in mobile cloud computing: a deep learning approach. In: IEEE wireless communications and networking conference (WCNC), 15–18 April, Barcelona, pp 1–6. https://doi.org/10.1109/WCNC.2018.8376973
NSL-KDD dataset (2015) http://nsl.cs.unb.ca/nsl-kdd
Padmakumari P, Surendra K, Sowmya M, Sravya M (2014) Effective intrusion detection system for cloud architecture. ARPN J Eng Appl Sci 9(11):2135–2139
Panov P, Džeroski S (2007) Combining bagging and random subspaces to create better ensembles. In: International symposium on intelligent data analysis, advances in intelligent data analysis VII, pp 118–129. https://doi.org/10.1007/978-3-540-74825-0_11
Park ST, Li G, Hong JC (2018) A study on smart factory-based ambient intelligence context-aware intrusion detection system using machine learning. J Ambient Intell Humaniz Comput. https://doi.org/10.1007/s12652-018-0998-6
Potteti S, Parati N (2015) Hybrid intrusion detection architecture for cloud environment. Int J Eng Comput Sci 4(5):12146–12151
Pratik PJ, Madhu BR (2013) Data mining based CIDS: Cloud intrusion detection system for masquerade attacks [DCIDSM]. In: 4th international conference on computing, communications and networking technologies (ICCCNT), Tiruchengode, July 4–6. https://doi.org/10.1109/ICCCNT.2013.6726497
Precup D’s Homepage (2018) Machine learning course. https://www.cs.mcgill.ca/~dprecup/courses/ML/Lectures/ml-lecture05.pdf
Saad EN, Mahdi KE, Zbakh M (2012) Cloud computing architectures based IDS. In: International conference on complex system (ICCS), Rabat, pp 1–6. https://doi.org/10.1109/ICoCS.2012.6458581
Tavallaee M, Bagheri E, Lu W, Ghorbani AA (2009) A detailed analysis of the KDD CUP 99 data set. In: 2nd IEEE symposium on computational intelligence for security and defence applications, Ottawa, July 8–10. https://doi.org/10.1109/CISDA.2009.5356528
Welling M (2005) Fisher linear discriminant analysis, vol 3, no 1. Department of Computer Science, University of Toronto
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix 1
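To prove the convexity of the cost function of logistic regression, we first need the following preliminaries (Precup 2018):
A function $f:\mathbb{R}^d \to \mathbb{R}$ is convex if, for all $a, b \in \mathbb{R}^d$ and $\lambda \in [0, 1]$:
$$f(\lambda a + (1-\lambda) b) \le \lambda f(a) + (1-\lambda) f(b).$$
If $f$ and $g$ are convex functions, $\alpha f + \beta g$ is also convex for any non-negative real numbers $\alpha$ and $\beta$.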
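The closure property relies on the weights being non-negative; a negative weight can turn a convex function into a concave one. The following sketch checks the defining inequality on random samples. The functions `f` and `g`, the sampling range and the helper name `violates_convexity` are illustrative assumptions, and a sampled check can only find counterexamples, not prove convexity.

```python
import random

def violates_convexity(func, trials=2000, seed=0, tol=1e-9):
    """Return True if some sampled (a, b, lam) breaks
    func(lam*a + (1-lam)*b) <= lam*func(a) + (1-lam)*func(b)."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b = rng.uniform(-5, 5), rng.uniform(-5, 5)
        lam = rng.random()
        lhs = func(lam * a + (1 - lam) * b)
        rhs = lam * func(a) + (1 - lam) * func(b)
        if lhs > rhs + tol:
            return True
    return False

f = lambda x: x * x   # convex
g = lambda x: abs(x)  # convex

# Non-negative weights: no violation is found.
nonneg_combo_violates = violates_convexity(lambda x: 2 * f(x) + 3 * g(x))

# A negative weight: -f is concave, so the inequality fails.
neg_combo_violates = violates_convexity(lambda x: -1.0 * f(x))
```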
Convexity can be characterized in two ways:
- First-order characterization (for differentiable $f$):
$f$ is convex $\Leftrightarrow$ for all $a, b$: $f(a) \ge f(b) + \nabla f(b)^T (a - b)$
(the function lies globally above its tangent at $b$).
- Second-order characterization (for twice-differentiable $f$):
$f$ is convex $\Leftrightarrow$ the Hessian of $f$ is positive semi-definite everywhere.
The Hessian $H$ contains the second-order derivatives of $f$: $H_{i,j} = \partial^2 f / \partial x_i \partial x_j$.
It is positive semi-definite if $a^T H a \ge 0$ for all $a \in \mathbb{R}^d$.
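As a small worked example of the second-order test (the function below is chosen only for illustration and does not appear in the original), take $f(x_1, x_2) = x_1^2 + x_1 x_2 + x_2^2$:

```latex
\[
H = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix},
\qquad
a^T H a = 2a_1^2 + 2a_1 a_2 + 2a_2^2
        = a_1^2 + a_2^2 + (a_1 + a_2)^2 \;\ge\; 0
\quad \text{for all } a \in \mathbb{R}^2 .
\]
```

The quadratic form is a sum of squares, so $H$ is positive semi-definite and $f$ is convex.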
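Using these definitions, we can now prove the convexity of the cost function. The cost function of logistic regression is:
$$J(w) = -\sum_{i=1}^{m} \left[ y_i \log \sigma(w^T x_i) + (1 - y_i) \log\left(1 - \sigma(w^T x_i)\right) \right],$$
where $m$ denotes the number of training examples, $y_i \in \{0, 1\}$, and $\sigma(z) = 1/(1 + e^{-z})$ (check that $\sigma'(z) = \sigma(z)(1 - \sigma(z))$).
We show that $-\log \sigma(w^T x)$ and $-\log(1 - \sigma(w^T x))$ are convex in $w$:
$$\nabla_w \left(-\log \sigma(w^T x)\right) = -\frac{\sigma(w^T x)\left(1 - \sigma(w^T x)\right)}{\sigma(w^T x)}\, x = \left(\sigma(w^T x) - 1\right) x$$
$$\nabla_w^2 \left(-\log \sigma(w^T x)\right) = \sigma(w^T x)\left(1 - \sigma(w^T x)\right) x x^T$$
⇒ It is easy to check that this matrix is positive semi-definite for any $x$: for every $a$, $a^T x x^T a = (a^T x)^2 \ge 0$, and the scalar factor $\sigma(w^T x)(1 - \sigma(w^T x))$ is positive.
Similarly you can show that:
$$\nabla_w \left(-\log\left(1 - \sigma(w^T x)\right)\right) = \sigma(w^T x)\, x$$
$$\nabla_w^2 \left(-\log\left(1 - \sigma(w^T x)\right)\right) = \sigma(w^T x)\left(1 - \sigma(w^T x)\right) x x^T$$
⇒ $J(w)$ is convex in $w$, because it is a sum of convex terms with the non-negative weights $y_i$ and $1 - y_i$.
⇒ The gradient of $J$ is $X^T(\hat{y} - y)$, where $\hat{y}_i = \sigma(w^T x_i) = h(x_i)$.
⇒ The Hessian of $J$ is $X^T R X$, where $R$ is diagonal with entries $R_{i,i} = h(x_i)(1 - h(x_i))$.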
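The gradient and Hessian expressions can be checked numerically. The sketch below assumes the unnormalized cost (a plain sum over examples, which is the form consistent with the gradient X^T(ŷ − y)), and uses a small hypothetical data set; the function names are illustrative. It compares the analytic gradient with central finite differences and samples the quadratic form a^T H a.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cost(w, X, y):
    """J(w) = -sum_i [y_i log h(x_i) + (1 - y_i) log(1 - h(x_i))]."""
    total = 0.0
    for xi, yi in zip(X, y):
        h = sigmoid(dot(w, xi))
        total -= yi * math.log(h) + (1 - yi) * math.log(1 - h)
    return total

def gradient(w, X, y):
    """Analytic gradient X^T (y_hat - y)."""
    g = [0.0] * len(w)
    for xi, yi in zip(X, y):
        err = sigmoid(dot(w, xi)) - yi
        for j, xj in enumerate(xi):
            g[j] += err * xj
    return g

def hessian(w, X):
    """Analytic Hessian X^T R X with R_ii = h(x_i) (1 - h(x_i))."""
    d = len(w)
    H = [[0.0] * d for _ in range(d)]
    for xi in X:
        h = sigmoid(dot(w, xi))
        r = h * (1 - h)
        for j in range(d):
            for k in range(d):
                H[j][k] += r * xi[j] * xi[k]
    return H

# Hypothetical data: first column is a constant bias feature.
X = [[1.0, 0.5], [1.0, -1.2], [1.0, 2.0], [1.0, -0.3]]
y = [1, 0, 1, 0]
w = [0.1, -0.4]

# Central finite differences of J versus the analytic gradient.
eps = 1e-6
g = gradient(w, X, y)
fd = []
for j in range(len(w)):
    wp, wm = list(w), list(w)
    wp[j] += eps
    wm[j] -= eps
    fd.append((cost(wp, X, y) - cost(wm, X, y)) / (2 * eps))
max_grad_err = max(abs(a - b) for a, b in zip(g, fd))

# Sample a^T H a for random directions a; it should never be negative.
rng = random.Random(0)
H = hessian(w, X)
min_quad = min(
    sum(a[j] * H[j][k] * a[k] for j in range(2) for k in range(2))
    for a in ([rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(1000))
)
```

A small `max_grad_err` and a non-negative `min_quad` are consistent with the derivation, although sampling directions is only a sanity check and not a proof of positive semi-definiteness.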
Appendix 2
Diagrams of the Gaussian normal function of all 41 features of the NSL-KDD data set.
Cite this article
Besharati, E., Naderan, M. & Namjoo, E. LR-HIDS: logistic regression host-based intrusion detection system for cloud environments. J Ambient Intell Human Comput 10, 3669–3692 (2019). https://doi.org/10.1007/s12652-018-1093-8
DOI: https://doi.org/10.1007/s12652-018-1093-8