Evaluation of machine learning classifiers for mobile malware detection

  • Methodologies and Application
Soft Computing

Abstract

Mobile devices have become a significant part of people’s lives, and the number of users of such technology keeps growing. This growing user base attracts hackers who create malicious applications, while the security of sensitive data stored on mobile devices is often taken lightly. Relying on existing approaches is not sufficient, because intelligent malware changes rapidly and consequently becomes more difficult to detect. In this paper, we propose an alternative solution: evaluating malware detection using an anomaly-based approach with machine learning classifiers. From the various network traffic features, four categories are selected: basic information, content based, time based and connection based. The evaluation uses two datasets: a public one (MalGenome) and a private, self-collected one. Based on the evaluation results, both the Bayes network and random forest classifiers produced more accurate readings, with a 99.97 % true-positive rate (TPR), as opposed to the multi-layer perceptron with only 93.03 %, on the MalGenome dataset. However, the experiment also revealed that the k-nearest neighbor classifier efficiently detected the latest Android malware, with an 84.57 % true-positive rate, higher than that of the other classifiers.
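The true-positive rate quoted in the abstract is the fraction of actual malware samples that a classifier flags as malware, TPR = TP / (TP + FN). The following is a minimal sketch of that calculation only, not the authors' evaluation pipeline; the function name `true_positive_rate` and the toy labels are hypothetical and chosen purely for illustration.

```python
from typing import Sequence


def true_positive_rate(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> float:
    """Return TP / (TP + FN): the share of actual positives that were detected."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # True positives: actual positive and predicted positive.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    # False negatives: actual positive but predicted as something else.
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("TPR is undefined when there are no positive samples")
    return tp / (tp + fn)


# Hypothetical toy labels (not data from the paper): 1 = malicious, 0 = benign.
y_true = [1, 1, 1, 1, 0, 0, 0, 1]
y_pred = [1, 1, 0, 1, 0, 1, 0, 1]

tpr = true_positive_rate(y_true, y_pred)
print(f"TPR = {tpr:.2%}")  # 4 of the 5 malicious samples are detected
```

Note that TPR ignores false alarms on benign traffic (the sample at index 5 above does not affect it), so it is usually read alongside the false-positive rate.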

Acknowledgments

The authors would like to thank Hamid Talebian and Shahaboddin Shamshirband for their valuable comments. This work is supported by the Ministry of Science, Technology and Innovation, Malaysia under Grant eScienceFund 01-01-03-SF0914.

Author information

Corresponding author

Correspondence to Ali Feizollah.

Additional information

Communicated by V. Loia.

Cite this article

Narudin, F.A., Feizollah, A., Anuar, N.B. et al. Evaluation of machine learning classifiers for mobile malware detection. Soft Comput 20, 343–357 (2016). https://doi.org/10.1007/s00500-014-1511-6

  • DOI: https://doi.org/10.1007/s00500-014-1511-6
