Abstract
To evade detection, attackers usually obfuscate malicious Android applications. These applications often have randomly generated application IDs or package names, and they are often signed with randomly created certificates. Conventional machine learning models for detecting such malware are neither sufficiently robust nor scalable to the volume of Android applications produced daily. Recurrent neural networks (RNNs) and convolutional neural networks (CNNs) have been applied to identify malware by learning patterns in sequence data. We propose a novel classification method for malicious Android applications that uses stacked RNNs and CNNs, so that the model learns the generalized correlation between obfuscated string patterns in an application’s package name and its certificate owner name. The model extracts machine learning features using an embedding layer and gated recurrent units (GRUs), and an additional CNN unit further optimizes the feature extraction process. Our experiments demonstrate that our approach outperforms n-gram-based models and that our feature extraction method is robust to obfuscation and sufficiently lightweight for Android devices.
© 2019 Springer Nature Switzerland AG
Cite this chapter
Lee, W.Y., Saxe, J., Harang, R. (2019). SeqDroid: Obfuscated Android Malware Detection Using Stacked Convolutional and Recurrent Neural Networks. In: Alazab, M., Tang, M. (eds) Deep Learning Applications for Cyber Security. Advanced Sciences and Technologies for Security Applications. Springer, Cham. https://doi.org/10.1007/978-3-030-13057-2_9
DOI: https://doi.org/10.1007/978-3-030-13057-2_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-13056-5
Online ISBN: 978-3-030-13057-2
eBook Packages: Computer Science (R0)