LR-HIDS: logistic regression host-based intrusion detection system for cloud environments

  • Original Research
Journal of Ambient Intelligence and Humanized Computing

Abstract

Cloud computing is an Internet-based computing environment in which storage and computing resources are assigned dynamically among users according to their needs, using virtualization technology. Virtualization is the underlying infrastructure of cloud computing, and it has introduced certain security problems as cloud computing has developed. One essential but formidable task in cloud computing is to detect malicious attacks and identify their types. Given the increasing number of cyber-attacks, the design and implementation of effective intrusion detection systems to protect information systems is crucial. In this paper, a host-based intrusion detection system (H-IDS) for protecting virtual machines in the cloud environment is proposed. To this end, the important features of each class are first selected using logistic regression, and these values are then improved using the regularization technique. Next, various attacks are classified using a combination of three different classifiers, namely neural network, decision tree and linear discriminant analysis, with the bagging algorithm for each class. The proposed model has been trained and tested using the NSL-KDD data set with an implementation in the Cloudsim software. Simulation results, compared with other methods, show an acceptable accuracy of about 97.51% for detecting attacks against normal states.

Notes

  1. Intelligent agent based enhanced multiclass support vector machine (IAEMSVM).

  2. Pruning algorithm rule-based classification tree (PART).

References

  • Alpaydin E (2004) Introduction to machine learning. MIT Press, Cambridge

  • Alqahtani SM, Balushi MA, John R (2014) An intelligent intrusion detection system for cloud computing (SIDSCC). In: International conference on computational science and computational intelligence, Las Vegas, March 10–13. https://doi.org/10.1109/CSCI.2014.108

  • Amor NB, Benferhat S, Elouedi Z (2004) Naive Bayes vs decision trees in intrusion detection systems. In: Proceedings of the 2004 ACM symposium on applied computing, Nicosia, pp 420–424. https://doi.org/10.1145/967900.967989

  • Aygun RC, Yavuz AG (2017) Network anomaly detection with stochastically improved autoencoder based models. In: IEEE 4th international conference on cyber security and cloud computing (CSCloud), New York, pp 193–198. https://doi.org/10.1109/CSCloud.2017.39

  • Benzidane K, Khoudali S, Sekkaki A (2013) Secured architecture for inter-VM traffic in a Cloud environment. In: 2nd IEEE Latin American conference on cloud computing and communications, Maceio, Dec 9–10, pp 23–28. https://doi.org/10.1109/LatinCloud.2013.6842218

  • Bhat A, Patra S, Jena D (2013) Machine learning approach for intrusion detection on cloud virtual machines. Int J Appl Innov Eng Manag (IJAIEM) 2(6):57–66

  • Bi M, Xu J, Wang M, Zhou F (2016) Anomaly detection model of user behavior based on principle component analysis. J Ambient Intell Humaniz Comput 7(4):547–554. https://doi.org/10.1007/s12652-015-0341-4

  • Bühlmann P, Yu B (2002) Analyzing bagging. Ann Stat 30(4):927–961

  • Cloudsim simulator (2015) http://www.cloudbus.org/cloudsim

  • Deshpande P, Sharma SC, Peddoju SK, Junaid S (2018) HIDS: a host based intrusion detection system for cloud computing environment. Int J Syst Assur Eng Manag 9(3):567–576. https://doi.org/10.1007/s13198-014-0277-7

  • Dhanabal L, Shantharajah DS (2015) A study on NSL-KDD dataset for intrusion detection system based on classification algorithms. Int J Adv Res Comput Commun Eng 4(6):446–452

  • El-Koka A, Cha KH, Kang DK (2013) Regularization parameter tuning optimization approach in logistic regression. In: 15th international conference on advanced communications technology (ICACT), 27–30 Jan, Pyeong Chang, pp 13–18

  • Garfinkel T, Rosenblum M (2005) When virtual is harder than real: security challenges in virtual machine based computing environments. In: 10th workshop on hot topics in operating systems (HOTOS’05), Santa Fe, June 12–15, pp 20–25

  • Ghosh P, Mandal AK, Kumar R (2015) An efficient cloud network intrusion detection system. Inf Syst Des Intell Appl 1:91–99. https://doi.org/10.1007/978-81-322-2250-7_10

  • Gorelik E (2013) Cloud computing models. M.Sc. thesis, Massachusetts Institute of Technology

  • Jin H, Xiang G, Zou D, Wu S, Zhao F, Li M (2013) A VMM-based intrusion prevention system in cloud computing environment. J Supercomput 66(3):1133–1151. https://doi.org/10.1007/s11227-011-0608-2

  • Kannan A, Maguire GQ, Sharma A, Schoo P (2012) Genetic algorithm based feature selection algorithm for effective intrusion detection in cloud networks. In: IEEE 12th international conference on data mining workshops, Brussels, 10 Dec. https://doi.org/10.1109/ICDMW.2012.56

  • Khorshed MT, Ali AS, Wasimi SA (2011) Monitoring insiders activities in cloud computing using rule based learning. In: IEEE 10th international conference on trust, security and privacy in computing and communications, Changsha, Nov 16–18. https://doi.org/10.1109/TrustCom.2011.99

  • Langin C, Rahimi S (2010) Soft computing in intrusion detection: the state of the art. J Ambient Intell Humaniz Comput 1(2):134–145. https://doi.org/10.1007/s12652-010-0012-4

  • Li Z, Sun W, Wang L (2012) A neural network based distributed intrusion detection system on cloud platform. In: IEEE 2nd international conference on cloud computing and intelligence systems, Hangzhou, 30 Oct–1 Nov. https://doi.org/10.1109/CCIS.2012.6664371

  • Loog M (1999) Approximate pairwise accuracy criteria for multiclass linear dimension reduction: generalisations of the Fisher criterion. Delft University Press, The Netherlands

  • Loukas G, Vuong T, Heartfield R, Sakellari G, Yoon Y, Gan D (2018) Cloud-based cyber-physical intrusion detection for vehicles using deep learning. IEEE Access 6:3491–3508. https://doi.org/10.1109/ACCESS.2017.2782159

  • Mahmood Z, Agrawal C, Hasan SS, Zenab S (2012) Intrusion detection in cloud computing environment using neural network. Int J Res Comput Eng Electron 1(1):19–22

  • Modi CN, Patel DR, Patel A, Rajarajan M (2012) Integrating signature Apriori based network intrusion detection system (NIDS) in cloud computing. Proc Technol 6:905–912. https://doi.org/10.1016/j.protcy.2012.10.110

  • Muche EW (2016) Hybrid intrusion detection system for private cloud: an integrated approach. M.Sc. thesis, Bahir Dar University

  • Murphy KP (2012) Machine learning, a probabilistic perspective. MIT Press, Cambridge

  • Muthurajkumar S, Ganapathy S, Vijayalakshmi M, Kannan A (2015) An effective intrusion detection on cloud virtual machines using hybrid feature selection and multiclass classifier. Aust J Basic Appl Sci 9(6):38–41

  • Nagarajan P, Perumal G (2015) A neuro fuzzy based intrusion detection system for a cloud data center using adaptive learning. Cybern Inf Technol 15(3):88–103. https://doi.org/10.1515/cait-2015-0043

  • Nguyen KK, Hoang DT, Niyato D, Wang P, Nguyen D, Dutkiewicz E (2018) Cyberattack detection in mobile cloud computing: a deep learning approach. In: IEEE wireless communications and networking conference (WCNC), 15–18 April, Barcelona, pp 1–6. https://doi.org/10.1109/WCNC.2018.8376973

  • NSL-KDD dataset (2015) http://nsl.cs.unb.ca/nsl-kdd

  • Padmakumari P, Surendra K, Sowmya M, Sravya M (2014) Effective intrusion detection system for cloud architecture. ARPN J Eng Appl Sci 9(11):2135–2139

  • Panov P, Džeroski S (2007) Combining bagging and random subspaces to create better ensembles. In: International symposium on intelligent data analysis, advances in intelligent data analysis VII, pp 118–129. https://doi.org/10.1007/978-3-540-74825-0_11

  • Park ST, Li G, Hong JC (2018) A study on smart factory-based ambient intelligence context-aware intrusion detection system using machine learning. J Ambient Intell Humaniz Comput. https://doi.org/10.1007/s12652-018-0998-6

  • Potteti S, Parati N (2015) Hybrid intrusion detection architecture for cloud environment. Int J Eng Comput Sci 4(5):12146–12151

  • Pratik PJ, Madhu BR (2013) Data mining based CIDS: Cloud intrusion detection system for masquerade attacks [DCIDSM]. In: 4th international conference on computing, communications and networking technologies (ICCCNT), Tiruchengode, July 4–6. https://doi.org/10.1109/ICCCNT.2013.6726497

  • Precup D’s Homepage (2018) Machine learning course. https://www.cs.mcgill.ca/~dprecup/courses/ML/Lectures/ml-lecture05.pdf

  • Saad EN, Mahdi KE, Zbakh M (2012) Cloud computing architectures based IDS. In: International conference on complex system (ICCS), Rabat, pp 1–6. https://doi.org/10.1109/ICoCS.2012.6458581

  • Tavallaee M, Bagheri E, Lu W, Ghorbani AA (2009) A detailed analysis of the KDD CUP 99 data set. In: 2nd IEEE symposium on computational intelligence for security and defence applications, Ottawa, July 8–10. https://doi.org/10.1109/CISDA.2009.5356528

  • Welling M (2005) Fisher linear discriminant analysis, vol 3, no 1. Department of Computer Science University of Toronto

Author information

Corresponding author

Correspondence to Marjan Naderan.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1

To prove the convexity of the cost function of logistic regression, we first state the following preliminaries (Precup 2018):

A function f: R^d → R is convex if, for all a, b ∈ R^d and λ ∈ [0, 1]:

$${\text{f}}\left( {\lambda {\text{a}}+\left( {{\text{1}} - \lambda } \right){\text{b}}} \right) \leq \lambda {\text{f}}\left( {\text{a}} \right)+\left( {{\text{1}} - \lambda } \right){\text{f}}\left( {\text{b}} \right)$$
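
The inequality above can be spot-checked numerically. The following is a minimal sketch, not part of the original proof: it samples random pairs a, b and a random λ, and evaluates the gap λf(a) + (1 − λ)f(b) − f(λa + (1 − λ)b) for f(w) = −log σ(w^T x). The function names (`convexity_gap`, `min_gap`), the sampling ranges and the dimension are illustrative assumptions. A non-negative gap on every sample is consistent with convexity but does not prove it.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def neg_log_sigmoid(w, x):
    """f(w) = -log sigma(w^T x) for a fixed sample x."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return -math.log(sigmoid(z))


def convexity_gap(f, a, b, lam):
    """lam*f(a) + (1-lam)*f(b) - f(lam*a + (1-lam)*b); >= 0 if f is convex."""
    mix = [lam * ai + (1.0 - lam) * bi for ai, bi in zip(a, b)]
    return lam * f(a) + (1.0 - lam) * f(b) - f(mix)


def min_gap(trials=1000, d=3, seed=0):
    """Smallest gap seen over random (a, b, lam); sampling ranges are arbitrary."""
    rng = random.Random(seed)
    x = [rng.uniform(-1, 1) for _ in range(d)]
    f = lambda w: neg_log_sigmoid(w, x)
    worst = float("inf")
    for _ in range(trials):
        a = [rng.uniform(-5, 5) for _ in range(d)]
        b = [rng.uniform(-5, 5) for _ in range(d)]
        worst = min(worst, convexity_gap(f, a, b, rng.random()))
    return worst
```

For comparison, a concave function such as f(w) = −w₁² gives a negative gap for a = (1), b = (−1), λ = 0.5, so the same helper also exposes non-convex functions.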

If f and g are convex functions, αf + βg is also convex for any non-negative real numbers α and β.

Convexity of a differentiable function can be characterized in two equivalent ways (the second requires f to be twice differentiable):

  • First-order characterization:

f is convex ⇔ for all a, b: f(a) ≥ f(b) + ∇f(b)^T (a − b)

(the function lies globally above its tangent at b).

  • Second-order characterization:

f is convex ⇔ the Hessian of f is positive semi-definite everywhere.

The Hessian H contains the second-order derivatives of f: H_{i,j} = ∂²f/∂x_i∂x_j.

It is positive semi-definite if a^T H a ≥ 0 for all a ∈ R^d.

According to these definitions we can now prove the convexity of the cost function. The cost function of logistic regression is:

$$J(w) = -\sum_{i=1}^{m}\left[ y_i \log \sigma\left(w^{T} x_i\right) + \left(1 - y_i\right)\log\left(1 - \sigma\left(w^{T} x_i\right)\right) \right]$$

where σ(z) = 1/(1 + e^{−z}) (check that σ′(z) = σ(z)(1 − σ(z))).
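
The identity σ′(z) = σ(z)(1 − σ(z)) can be checked numerically. The sketch below compares the closed form with a central finite difference at a few points. The helper names, the step size h and the test points are illustrative assumptions.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sigmoid_prime(z):
    """Closed-form derivative sigma(z) * (1 - sigma(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)


def central_difference(f, z, h=1e-5):
    """Second-order accurate finite-difference estimate of f'(z)."""
    return (f(z + h) - f(z - h)) / (2.0 * h)


def max_derivative_error(points=(-4.0, -1.0, 0.0, 0.5, 3.0)):
    """Largest absolute gap between the closed form and the estimate."""
    return max(
        abs(central_difference(sigmoid, z) - sigmoid_prime(z)) for z in points
    )
```

The two values should agree up to the truncation and rounding error of the finite difference, which is tiny for this step size.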

We show that −log σ(w^T x) and −log(1 − σ(w^T x)) are convex in w:

$$\begin{aligned} \nabla_{w}\left( -\log \sigma\left(w^{T}x\right) \right) &= -\nabla_{w}\left( \sigma\left(w^{T}x\right) \right) / \sigma\left(w^{T}x\right) \\ &= -\sigma'\left(w^{T}x\right) \nabla_{w}\left(w^{T}x\right) / \sigma\left(w^{T}x\right) \\ &= \left( \sigma\left(w^{T}x\right) - 1 \right) x \\ \nabla_{w}^{2}\left( -\log \sigma\left(w^{T}x\right) \right) &= \nabla_{w}\left( \left( \sigma\left(w^{T}x\right) - 1 \right) x \right) \\ &= \sigma\left(w^{T}x\right)\left( 1 - \sigma\left(w^{T}x\right) \right) x x^{T} \end{aligned}$$

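⇒ This matrix is positive semi-definite for any x: for every a ∈ R^d, a^T(σ(w^T x)(1 − σ(w^T x)) x x^T)a = σ(w^T x)(1 − σ(w^T x))(a^T x)² ≥ 0, because σ takes values in (0, 1).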
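
The positive semi-definiteness claim can also be spot-checked. The following is a minimal sketch under stated assumptions: it builds the matrix σ(w^T x)(1 − σ(w^T x)) x x^T for random w and x and evaluates the quadratic form a^T H a for random a. The names `hessian_term` and `smallest_quadratic_form` and the sampling ranges are hypothetical choices for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def hessian_term(w, x):
    """Hessian of -log sigma(w^T x) in w: sigma * (1 - sigma) * x x^T."""
    s = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    c = s * (1.0 - s)
    return [[c * xi * xj for xj in x] for xi in x]


def quadratic_form(H, a):
    """Compute a^T H a for a square matrix H given as nested lists."""
    n = len(a)
    return sum(a[i] * H[i][j] * a[j] for i in range(n) for j in range(n))


def smallest_quadratic_form(trials=500, d=4, seed=1):
    """Smallest a^T H a over random (w, x, a); ranges are arbitrary."""
    rng = random.Random(seed)
    smallest = float("inf")
    for _ in range(trials):
        w = [rng.uniform(-2, 2) for _ in range(d)]
        x = [rng.uniform(-3, 3) for _ in range(d)]
        a = [rng.uniform(-3, 3) for _ in range(d)]
        smallest = min(smallest, quadratic_form(hessian_term(w, x), a))
    return smallest
```

Up to floating-point rounding, every sampled value is non-negative, matching the closed form σ(1 − σ)(a^T x)².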

Similarly you can show that:

$$\begin{aligned} \nabla_{w}\left( -\log\left( 1 - \sigma\left(w^{T}x\right) \right) \right) &= \sigma\left(w^{T}x\right) x \\ \nabla_{w}^{2}\left( -\log\left( 1 - \sigma\left(w^{T}x\right) \right) \right) &= \sigma\left(w^{T}x\right)\left( 1 - \sigma\left(w^{T}x\right) \right) x x^{T} \end{aligned}$$

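⇒ J(w) is convex in w, since it is a sum of these convex terms weighted by the non-negative coefficients y_i and 1 − y_i.

⇒ The gradient of J is X^T(ŷ − y), where ŷ_i = σ(w^T x_i) = h(x_i).

⇒ The Hessian of J is X^T R X, where R is diagonal with entries R_{i,i} = h(x_i)(1 − h(x_i)).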
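
The closed forms for the gradient and the Hessian can be sketched in plain Python and compared with a finite-difference estimate of the gradient. This is an illustrative sketch only: the toy data `X_TOY` and `Y_TOY` are made up for the check and are not taken from NSL-KDD, and the function names are assumptions rather than the authors' implementation.

```python
import math

# Hypothetical toy data: the first column is a bias term, labels are 0/1.
X_TOY = [[1.0, 0.5], [1.0, -1.5], [1.0, 2.0], [1.0, -0.3]]
Y_TOY = [1, 0, 1, 0]


def sigmoid(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, x):
    """h(x) = sigma(w^T x)."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)))


def cost(w, X, y):
    """J(w) = -sum_i [y_i log h(x_i) + (1 - y_i) log(1 - h(x_i))]."""
    total = 0.0
    for xi, yi in zip(X, y):
        h = predict(w, xi)
        total -= yi * math.log(h) + (1 - yi) * math.log(1.0 - h)
    return total


def gradient(w, X, y):
    """Closed form X^T (y_hat - y)."""
    g = [0.0] * len(w)
    for xi, yi in zip(X, y):
        residual = predict(w, xi) - yi
        for j in range(len(w)):
            g[j] += residual * xi[j]
    return g


def hessian(w, X):
    """Closed form X^T R X with R_ii = h(x_i) * (1 - h(x_i))."""
    d = len(w)
    H = [[0.0] * d for _ in range(d)]
    for xi in X:
        h = predict(w, xi)
        r = h * (1.0 - h)
        for j in range(d):
            for k in range(d):
                H[j][k] += r * xi[j] * xi[k]
    return H


def numerical_gradient(w, X, y, eps=1e-6):
    """Central finite-difference estimate of the gradient of J."""
    g = []
    for j in range(len(w)):
        up, down = list(w), list(w)
        up[j] += eps
        down[j] -= eps
        g.append((cost(up, X, y) - cost(down, X, y)) / (2.0 * eps))
    return g
```

Calling `gradient([0.1, -0.2], X_TOY, Y_TOY)` and `numerical_gradient([0.1, -0.2], X_TOY, Y_TOY)` should give nearly identical vectors. Because the Hessian is positive semi-definite, methods such as gradient descent or Newton's method applied to J cannot get trapped in a non-global local minimum.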

Appendix 2

Diagrams of the Gaussian normal function of all 41 features of the NSL-KDD data set.

Cite this article

Besharati, E., Naderan, M. & Namjoo, E. LR-HIDS: logistic regression host-based intrusion detection system for cloud environments. J Ambient Intell Human Comput 10, 3669–3692 (2019). https://doi.org/10.1007/s12652-018-1093-8
