SeqDroid: Obfuscated Android Malware Detection Using Stacked Convolutional and Recurrent Neural Networks

Lee, William Younghoo; Saxe, Joshua; Harang, Richard

doi:10.1007/978-3-030-13057-2_9

Part of the book series: Advanced Sciences and Technologies for Security Applications (ASTSA)


Abstract

To evade detection, attackers usually obfuscate malicious Android applications. These applications often have randomly generated application IDs or package names and are often signed with randomly created certificates. Conventional machine learning models for detecting such malware are neither robust enough nor scalable to the volume of Android applications produced daily. Recurrent neural networks (RNNs) and convolutional neural networks (CNNs) have been applied to identify malware by learning patterns in sequence data. We propose a novel malware classification method for malicious Android applications using stacked RNNs and CNNs, so that our model learns the generalized correlation between obfuscated string patterns in an application’s package name and its certificate owner name. The model extracts machine learning features using embedding and gated recurrent units (GRUs), and an additional CNN unit further optimizes the feature extraction process. Our experiments demonstrate that our approach outperforms n-gram-based models and that our feature extraction method is robust to obfuscation and sufficiently lightweight for Android devices.
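As a rough illustration of the input side of such a model, the sketch below turns a package name and a certificate owner name into fixed-length integer sequences of the kind an embedding layer typically consumes. The abstract does not specify the tokenization, so everything here is an assumption made for illustration: the character-level encoding, the alphabet, the sequence length of 64, and the helper names `build_vocab`, `encode`, and `encode_app` are all hypothetical and are not taken from the chapter.

```python
from typing import Dict, List, Tuple

PAD = 0  # index reserved for padding (assumed convention)
UNK = 1  # index reserved for characters outside the alphabet (assumed convention)

# Hypothetical alphabet; the chapter's actual vocabulary is not given in the abstract.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789._-=, "


def build_vocab(alphabet: str) -> Dict[str, int]:
    """Map each character to an integer index, starting after PAD and UNK."""
    return {ch: i + 2 for i, ch in enumerate(alphabet)}


VOCAB = build_vocab(ALPHABET)


def encode(text: str, vocab: Dict[str, int], max_len: int) -> List[int]:
    """Lowercase, truncate to max_len, map characters to indices, then right-pad."""
    ids = [vocab.get(ch, UNK) for ch in text.lower()[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


def encode_app(package_name: str, cert_owner: str,
               max_len: int = 64) -> Tuple[List[int], List[int]]:
    """Encode the two string inputs named in the abstract as separate sequences."""
    return encode(package_name, VOCAB, max_len), encode(cert_owner, VOCAB, max_len)


# Usage: one pair of sequences per application.
pkg_ids, cert_ids = encode_app("com.example.app", "CN=Example, O=Example Inc")
print(pkg_ids[:16])
print(cert_ids[:16])
```

Each sequence would then be fed to its own embedding layer; the GRU and CNN stages described in the abstract operate on the embedded sequences and are not shown here.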

Author information

Authors and Affiliations

Sophos Group PLC, Abingdon, UK
William Younghoo Lee, Joshua Saxe & Richard Harang

Corresponding author

Correspondence to William Younghoo Lee.

Editor information

Editors and Affiliations

Charles Darwin University, Casuarina, NT, Australia
Mamoun Alazab
Singtel Optus, Sydney, NSW, Australia
MingJian Tang

About this chapter

Cite this chapter

Lee, W.Y., Saxe, J., Harang, R. (2019). SeqDroid: Obfuscated Android Malware Detection Using Stacked Convolutional and Recurrent Neural Networks. In: Alazab, M., Tang, M. (eds) Deep Learning Applications for Cyber Security. Advanced Sciences and Technologies for Security Applications. Springer, Cham. https://doi.org/10.1007/978-3-030-13057-2_9

DOI: https://doi.org/10.1007/978-3-030-13057-2_9
Published: 15 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-13056-5
Online ISBN: 978-3-030-13057-2
eBook Packages: Computer Science, Computer Science (R0)
