RAMD: registry-based anomaly malware detection using one-class ensemble classifiers

Tajoddin, Asghar; Abadi, Mahdi

doi:10.1007/s10489-018-01405-0

Published: 29 January 2019

Applied Intelligence, volume 49, pages 2641–2658 (2019)

Abstract

Malware is continuously evolving and becoming more sophisticated to avoid detection. Traditionally, the Windows operating system has been the most popular target for malware writers because of its dominance in the desktop operating system market. However, although a large volume of new Windows malware samples is collected daily, there is relatively little research focusing on Windows malware. The Windows Registry, or simply the registry, is used heavily by programs in Windows, making it a good source for detecting malicious behavior. In this paper, we present RAMD, a novel approach that uses an ensemble classifier consisting of multiple one-class classifiers to detect known and especially unknown malware that abuses registry keys and values for malicious intent. RAMD builds a model of the registry behavior of benign programs and then uses this model to detect malware by looking for anomalous registry accesses. In detail, it constructs an initial ensemble classifier by training multiple one-class classifiers and then applies a novel swarm intelligence pruning algorithm, called memetic firefly-based ensemble classifier pruning (MFECP), to the ensemble classifier to reduce its size by selecting only a subset of one-class classifiers that are highly accurate and have diversity in their outputs. To combine the outputs of the one-class classifiers in the pruned ensemble classifier, RAMD uses a specific aggregation operator, called Fibonacci-based superincreasing ordered weighted averaging (FSOWA). The results of our experiments performed on a dataset of benign and malware samples show that RAMD can achieve a detection rate of about 98.52%, a false alarm rate of 2.19%, and an accuracy of 98.43%.
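As a rough illustration of the aggregation and evaluation steps named in the abstract, the sketch below shows a generic ordered weighted averaging (OWA) operator and the three reported metrics. It is not the authors' implementation: the abstract does not give the FSOWA weights, so the weight vectors here are placeholders supplied by the caller, and the function names `owa` and `detection_metrics` are hypothetical. The metric formulas are the conventional ones, assuming malware is the positive class.

```python
from typing import Dict, Sequence


def owa(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Generic OWA: weights attach to rank positions, not to classifiers."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Sort the classifier outputs in descending order, then take the
    # weighted average with the weights normalised to sum to one.
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered)) / total


def detection_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Conventional metrics, assuming malware is the positive class."""
    return {
        "detection_rate": tp / (tp + fn),      # malware correctly flagged
        "false_alarm_rate": fp / (fp + tn),    # benign wrongly flagged
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# Illustrative anomaly scores from three one-class classifiers (made up).
scores = [0.2, 0.9, 0.6]
max_like = owa(scores, [1, 0, 0])   # all weight on the largest score
min_like = owa(scores, [0, 0, 1])   # all weight on the smallest score
mean_like = owa(scores, [1, 1, 1])  # equal weights give the plain mean
```

The choice of weight vector moves the operator between maximum-like and minimum-like behavior, which is why an OWA-based combiner is characterised mainly by how its weights are generated.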

Author information

Authors and Affiliations

School of Electrical and Computer Engineering, Tarbiat Modares University, Tehran, Iran
Asghar Tajoddin & Mahdi Abadi

Corresponding author

Correspondence to Mahdi Abadi.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Tajoddin, A., Abadi, M. RAMD: registry-based anomaly malware detection using one-class ensemble classifiers. Appl Intell 49, 2641–2658 (2019). https://doi.org/10.1007/s10489-018-01405-0

Issue Date: 15 July 2019
DOI: https://doi.org/10.1007/s10489-018-01405-0
