A multi-view context-aware approach to Android malware detection and malicious code localization

Narayanan, Annamalai; Chandramohan, Mahinthan; Chen, Lihui; Liu, Yang

doi:10.1007/s10664-017-9539-8

Published: 30 August 2017

Volume 23, pages 1222–1274, (2018)

Empirical Software Engineering

Annamalai Narayanan (ORCID: orcid.org/0000-0001-8452-3703)¹, Mahinthan Chandramohan¹, Lihui Chen¹ & Yang Liu¹

Abstract

Many existing Machine Learning (ML) based Android malware detection approaches use a variety of features such as security-sensitive APIs, system calls, control-flow structures and information flows in conjunction with ML classifiers to achieve accurate detection. Each of these feature sets provides a unique semantic perspective (or view) of apps’ behaviors with inherent strengths and limitations. That is, some views are more amenable to detecting certain attacks but may not be suitable for characterizing several others. Most existing malware detection approaches use only one (or a selected few) of the aforementioned feature sets, which prevents them from detecting a vast majority of attacks. Addressing this limitation, we propose MKLDroid, a unified framework that systematically integrates multiple views of apps to perform comprehensive malware detection and malicious code localization. The rationale is that, while a malware app can disguise itself in some views, disguising itself in every view while maintaining malicious intent is much harder. MKLDroid uses a graph kernel to capture structural and contextual information from apps’ dependency graphs and identify malice code patterns in each view. Subsequently, it employs Multiple Kernel Learning (MKL) to find a weighted combination of the views that yields the best detection accuracy. Besides multi-view learning, MKLDroid’s unique and salient trait is its ability to locate fine-grained malice code portions in dependency graphs (e.g., methods/classes). Malicious code localization caters to several important applications, such as supporting human analysts studying malware behaviors, engineering malware signatures, and developing other counter-measures. Through large-scale experiments on several datasets (including wild apps), we demonstrate that MKLDroid consistently outperforms three state-of-the-art techniques in terms of accuracy while maintaining comparable efficiency. In our malicious code localization experiments on a dataset of repackaged malware, MKLDroid was able to identify all the malice classes with 94% average recall. Our work opens up two new avenues in malware research: (i) enabling the research community to elegantly look at Android malware behaviors from multiple perspectives simultaneously, and (ii) performing precise and scalable malicious code localization.
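
The weighted combination of views mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each view has already produced a kernel (similarity) matrix over the same set of apps and that the weights are simply given; the view names, toy matrices, weights and the function `combine_kernels` are hypothetical, and the sketch does not show how MKL learns the weights.

```python
from typing import Dict, List

Matrix = List[List[float]]


def combine_kernels(kernels: Dict[str, Matrix], weights: Dict[str, float]) -> Matrix:
    """Return the weighted sum of per-view kernel matrices.

    Assumes all matrices are square and of the same size, and that the
    weights are non-negative (so the sum remains a valid kernel).
    """
    if set(kernels) != set(weights):
        raise ValueError("kernels and weights must cover the same views")
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    n = len(next(iter(kernels.values())))
    combined = [[0.0] * n for _ in range(n)]
    for view, matrix in kernels.items():
        w = weights[view]
        for i in range(n):
            for j in range(n):
                combined[i][j] += w * matrix[i][j]
    return combined


# Hypothetical toy kernels for three apps in two views.
toy_kernels = {
    "api": [[1.0, 0.8, 0.1], [0.8, 1.0, 0.2], [0.1, 0.2, 1.0]],
    "permission": [[1.0, 0.4, 0.3], [0.4, 1.0, 0.5], [0.3, 0.5, 1.0]],
}
toy_weights = {"api": 0.75, "permission": 0.25}  # illustrative, not learnt
combined_kernel = combine_kernels(toy_kernels, toy_weights)
```

A combined matrix of this kind could then be passed to any classifier that accepts a precomputed kernel; a non-uniform weighting lets a more discriminative view contribute more to the final similarity.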

Notes

The detailed procedure for constructing the ADG is provided later in Section 3.2.
Two existing works, PScout (Au et al. 2012) and SUSI (Rasthofer et al. 2014), list commonly known security-sensitive Android APIs. We use these two lists to identify sensitive APIs.
This follows from the observation that in most malware the malice code portion is closely knit, i.e., spanning only a few methods. We also attempted two other variants of CADG. We reduce the path in CICFG to edges in CADG (i) only if the calling and called nodes belong to the same package and (ii) only if the calling and called nodes belong to the same class. Both these variants contained a much larger number of edges and also failed to capture the attacks as effectively as the CADG defined above (experimentally verified).
PScout (Au et al. 2012) provides a mapping from Android APIs and URIs to the permissions required to access them. Furthermore, we infer the usage of intents, reflection and native code through relevant APIs and consider them as using special permissions. We use these mappings to build CPDGs.
To identify information sources and sinks accessed in CICFG nodes, we leverage SUSI (Rasthofer et al. 2014) and MUDFLOW (Avdiienko et al. 2015). Together, these works map Android APIs and URIs to 15 source and 18 sink categories.
To determine the categories of Dalvik instructions to be used as CICFG_ins node labels, we refer to Adagio (Gascon et al. 2013). The authors manually analyzed and categorized all the instructions into 15 distinct categories (such as move, invoke, etc.).
Pouik et al. (2012) leveraged a grammar proposed by Cesare and Xiang (2010) to represent CFG textual signatures in their work on establishing similarity between Android apps.
The reason why such an issue arises only in the case of CICFG_ins and CICFG_signs is understandable. In the case of CADG, CPDG and CSSDG, the number of unique node labels is limited by the APIs, permissions, and information source and sink categories available. Consequently, only a limited number of contextual neighborhood labels emerge from the relabeling process, thereby limiting the size of the vocabulary. However, in the case of CICFG_ins and CICFG_signs, the number of unique node labels (i.e., the number of unique instruction sequences and CFG signatures, respectively) across the whole dataset is extremely large, leading to a mammoth vocabulary Σ^∗.
From (9), it can be noted that a prediction made in this fashion is equivalent to one made with a linear SVM learnt as an optimization over \(\overrightarrow{\mathbf{W}}\) as follows: \(\min_{\overrightarrow{\mathbf{W}}} \lVert\overrightarrow{\mathbf{W}}\rVert^2 + \sum_{i=1}^{N} \max\left(0,\, 1 - y^{(i)} f(\overrightarrow{\mathbf{X}}^{(i)})\right)\).
https://www.virustotal.com
Recently, Li et al. (2017a) provided a dataset of repackaged apps of the form \((app_1, app_2)\), where \(app_1\) is the original (benign) app and \(app_2\) is the repackaged version of \(app_1\). However, they do not ascertain whether or not the new code injected in \(app_2\) is malicious. In fact, exploring this dataset, we observe that a majority of the repackaged apps were adware or other types of PHAs. Hence, we refrain from using this dataset, which lacks precise ground truth labels on malice methods and classes, in our experiments.
More than 80% of samples in this dataset are piggybacked malware, thus making this dataset amenable to our qualitative analysis (Li et al. 2017a).
Remember, we intend to avoid computing expensive data-flows in the app and believe that other views (computed at much lower expense) would complement and mitigate the absence of data-flow related features.
Though Adagio could, in principle, identify malice methods from CGs, the implementation provided at Adagio (2017) does not include this.
SHA-256: 1944d8ee5bdda3a1bd06555fdb10d3267ab0cc4511d1e40611baf3ce1b81e5e8
In this context, a leak through the internet is considered akin to writing into a file and hence we see a FILE sink instead of a NETWORK sink.
SHA-256: 7bbd566f2f3abb78b3ffcc23ba4ad84e06a00f758d245c660c61b21814a850a5
As discussed in Arp et al. (2014), Avdiienko et al. (2015), Garcia et al. (2015), Yang et al. (2014) and Kimberly et al. (2017), performing precise data-flow and dynamic analysis to extract features is computationally heavy.
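
The linear SVM objective quoted in the note on (9) can be evaluated directly. The following is a minimal sketch, assuming that the decision function is the plain dot product f(X) = W · X (the notes do not define f) and that labels take values in {-1, +1}; the function name `svm_objective` and the toy data are hypothetical.

```python
from typing import Sequence


def svm_objective(w: Sequence[float],
                  xs: Sequence[Sequence[float]],
                  ys: Sequence[int]) -> float:
    """Squared norm of w plus the summed hinge loss over all samples."""
    squared_norm = sum(wi * wi for wi in w)
    hinge_total = 0.0
    for x, y in zip(xs, ys):
        # Assumed decision function: f(X) = W . X (no bias term).
        score = sum(wi * xi for wi, xi in zip(w, x))
        hinge_total += max(0.0, 1.0 - y * score)
    return squared_norm + hinge_total


# Toy weight vector, feature vectors and labels (illustrative only).
w = [0.5, -0.5]
xs = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
ys = [1, -1, 1]
objective = svm_objective(w, xs, ys)
```

In this toy setting the first two samples sit exactly on the margin and contribute no loss, while the third has a score of zero and contributes a hinge loss of one.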

References

Aafer Y et al (2013) DroidAPIMiner: Mining API-level features for robust malware detection in android. In: International conference on security and privacy in communication systems. Springer International Publishing
Adagio source code. https://github.com/hgascon/adagio (accessed April 2017)
Allix K et al (2014) Machine learning-based malware detection for android applications: history matters! University of Luxembourg SnT, Luxembourg
Allix K et al (2016a) Empirical assessment of machine learning-based malware detectors for Android. Empirical Softw Eng 21(1):183–211
Allix K et al (2016b) Androzoo: Collecting millions of android apps for the research community. In: Proceedings of the 13th international conference on mining software repositories. ACM
Android Drawer third-party market. https://www.androiddrawer.com (accessed April 2017)
Androguard. https://code.google.com/p/androguard (accessed April 2017)
Anzhi third-party market. www.anzhi.com (accessed April 2017)
AppsApk third-party market. http://www.appsapk.com/ (accessed April 2017)
Arp D et al (2014) Drebin: Effective and explainable detection of android malware in your pocket. In: Proceedings of the annual symposium on network and distributed system security (NDSS)
Arzt S et al (2014) Flowdroid: Precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for android apps. ACM SIGPLAN Not 49(6):259–269
Au KWY et al (2012) Pscout: analyzing the android permission specification. In: Proceedings of the 2012 ACM conference on computer and communications security. ACM
Avdiienko V et al (2015) Mining apps for abnormal usage of sensitive data. In: 2015 IEEE/ACM 37th IEEE international conference on software engineering (ICSE), vol 1. IEEE
Biggio B et al (2014) Poisoning behavioral malware clustering. In: Proceedings of the 2014 workshop on artificial intelligent and security workshop. ACM
Borgwardt KM, Kriegel H-P (2005) Shortest-path kernels on graphs. In: Fifth IEEE international conference on data mining (ICDM’05). IEEE
Burges CJC (1998) A tutorial on support vector machines for pattern recognition. Data Min Knowl Discov 2(2):121–167
Burguera I et al (2011) Crowdroid: behavior-based malware detection system for android. In: Proceedings of the 1st ACM workshop on security and privacy in smartphones and mobile devices. ACM
Cesare S, Xiang Y (2010) Classification of malware using structured control flow. In: Proceedings of the eighth Australasian symposium on parallel and distributed computing, vol 107. Australian Computer Society, Inc., pp 61–70
Chakradeo S et al (2013) Mast: triage for market-scale mobile malware analysis. In: Proceedings of the sixth ACM conference on security and privacy in wireless and mobile networks. ACM
Chen K et al (2015) Finding unknown malice in 10 seconds: Mass vetting for new threats at the google-play scale. In: 24th USENIX security symposium (USENIX Security 15)
Dash SK et al (2016) DroidScribe: Classifying android malware based on runtime behavior. Mob Secur Technol (MoST 2016) 7148:1–12
Deo A et al (2016) Prescience: Probabilistic guidance on the retraining conundrum for malware detection. In: Proceedings of the 2016 ACM workshop on artificial intelligence and security. ACM
Elish KO et al (2015) Profiling user-trigger dependence for Android malware detection. Comput Secur 49:255–273
Enrico M et al (2016) MAMADROID: Detecting android malware by building markov chains of behavioral models. arXiv:1612.04433
FDroid third-party market. www.fdroid.org (accessed April 2017)
Fredrikson M et al (2010) Synthesizing near-optimal malware specifications from suspicious behaviors. In: 2010 IEEE symposium on security and privacy (SP). IEEE
Forman G (2003) An extensive empirical study of feature selection metrics for text classification. J Mach Learn Res 3:1289–1305
Garcia J et al (2015) Obfuscation-resilient, efficient, and accurate detection and family identification of android malware. Department of Computer Science, George Mason University, Technical Report, USA
Gärtner T et al (2003) On graph kernels: hardness results and efficient alternatives. Learning theory and kernel machines. Springer, Berlin Heidelberg, pp 129–143
Google Play. https://play.google.com/store (accessed April 2017)
Gönen M, Alpaydın E (2011) Multiple kernel learning algorithms. J Mach Learn Res 12:2211–2268
Gorla A et al (2014) Checking app behavior against app descriptions. In: Proceedings of the 36th international conference on software engineering. ACM
Gordon MI et al (2015) Information flow analysis of android applications in droidSafe. NDSS
Gascon H et al (2013) Structural detection of android malware using embedded call graphs. In: Proceedings of the 2013 ACM workshop on artificial intelligence and security. ACM
Guyon I, Elisseeff A (2003) An introduction to variable and feature selection. J Mach Learn Res 3:1157–1182
Hassen M, Chan PK (2017) Scalable Function Call Graph-based Malware Classification
Hido S, Kashima H (2009) A linear-time graph kernel. In: Ninth IEEE international conference on data mining 2009. ICDM’09. IEEE
Hoi SCH et al (2013) Online multiple kernel classification. Mach Learn 90(2):289–316
Kantchelian A et al (2013) Approaches to adversarial drift. In: Proceedings of the 2013 ACM workshop on artificial intelligence and security. ACM
Kimberly T et al (2017) The evolution of android malware and android analysis techniques. ACM Comput Surv (CSUR) 49(4):76
Kriege NM et al (2017) A unifying view of explicit and implicit feature maps for structured data: systematic studies of graph kernels. arXiv:1703.00676
Kaspersky 2016 Threat Report. https://kasperskycontenthub.com/securelist/files/2016/12/Kaspersky_Security_Bulleti_2016_Review_ENG.eps (accessed April 2017)
Li L et al (2015) Iccta: Detecting inter-component privacy leaks in android apps. In: Proceedings of the 37th international conference on software engineering, vol 1. IEEE Press
Li L et al (2017a) Understanding android app piggybacking: a systematic study of malicious code grafting. In: IEEE transactions on information forensics and security
Li L et al (2017b) Automatically locating malicious packages in piggybacked android apps. In: Proceedings of the international workshop on mobile software engineering and systems. ACM
Ma J et al (2009) Identifying suspicious URLs: an application of large-scale online learning. In: Proceedings of the 26th annual international conference on machine learning. ACM
Meng G et al (2016) Mystique: evolving android malware for auditing anti-malware tools. In: Proceedings of the 11th ACM on Asia conference on computer and communications security. ACM
Mu Z et al (2014) Semantics-aware Android malware classification using weighted contextual API dependency graphs. In: Proceedings of the 2014 ACM SIGSAC conference on computer and communications security. ACM
MKLDroid’s website: https://sites.google.com/view/mkldroid/home (accessed April 2017)
Narayanan A et al (2016a) Subgraph2vec: learning distributed representations of rooted sub-graphs from large graphs. In: Workshop on mining and learning with graphs
Narayanan A et al (2016b) Contextual weisfeiler-lehman graph kernel for malware detection. In: The 2016 international joint conference on neural networks (IJCNN). IEEE
Narayanan A et al (2016c) Adaptive and scalable android malware detection through online learning. In: The 2016 international joint conference on neural networks (IJCNN). IEEE
Octeau D et al (2015) Composite constant propagation: application to android inter-component communication analysis. In: Proceedings of the 37th international conference on software engineering, vol 1. IEEE Press
Peiravian N, Zhu X (2013) Machine learning for android malware detection using permission and api calls. In: 2013 IEEE 25th international conference on tools with artificial intelligence. IEEE
Pouik et al (2012) Similarities for fun & profit. Phrack 14, 68. http://www.phrack.org/issues.html?id=15&issue=68 (accessed April 2017)
Rasthofer S, Arzt S, Bodden E (2014) A machine-learning approach for classifying and categorizing android sources and sinks NDSS
Ribeiro MT et al (2016) Why Should I Trust You?: Explaining the predictions of any classifier. In: Proceedings of SIGKDD
Roy S et al (2015) Experimental study with real-world data for android app security analysis using machine learning. In: Proceedings of the 31st Annual Computer Security Applications Conference. ACM
Searles R et al (2017) Parallelization of machine learning applied to call graphs of binaries for malware detection. In: Proceedings of the 25th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2017, Russia
Sahs J, Khan L (2012) A machine learning approach to android malware detection. In: 2012 European Intelligence and Security Informatics Conference (EISIC). IEEE
Singh A et al (2012) Tracking concept drift in malware families. In: Proceedings of the 5th ACM Workshop on Security and Artificial Intelligence. ACM
Saracino A et al (2016) Madam: Effective and efficient behavior-based android malware detection and prevention. In: IEEE Transactions on Dependable and Secure Computing
Shervashidze N et al (2009) Efficient graphlet kernels for large graph comparison. AISTATS 5:488–495
Shervashidze N et al (2011) Weisfeiler-lehman graph kernels. J Mach Learn Res 12:2539–2561
SlideMe third-party market. www.SlideME.org (accessed April 2017)
Soot framework. http://sable.github.io/soot (accessed April 2017)
Suarez-Tangil G et al (2016) DroidSieve: Fast and Accurate Classification of Obfuscated Android Malware
Sun Z, Ampornpunt N, Varma M, Vishwanathan S (2010) Multiple kernel learning and the SMO algorithm. In: Advances in Neural Information Processing Systems, pp 2361–2369
Tian K et al (2016) Analysis of code heterogeneity for high-precision classification of repackaged malware. In: 2016 IEEE Security and Privacy Workshops (SPW). IEEE
Virus Share - malware collection. https://virusshare.com (accessed April 2017)
Yanardag P, Vishwanathan S (2015) Deep graph kernels. In: Proceedings of SIGKDD
Xu L et al (2016) HADM: Hybrid analysis for detection of malware. SAI Intelligent Systems Conference (IntelliSys), UK
Yang C et al (2014) Droidminer: Automated mining and characterization of fine-grained malicious behaviors in android applications. In: Computer Security-ESORICS 2014. Springer International Publishing, pp 163–182
Yang W et al (2015) Appcontext: Differentiating malicious and benign mobile app behaviors using context. In: Proceedings of the International Conference on Software Engineering (ICSE)
Zhou Y, Jiang X (2012) Dissecting android malware: Characterization and evolution. In: 2012 IEEE Symposium on Security and Privacy (SP). IEEE

Author information

Authors and Affiliations

Nanyang Technological University, Singapore, Singapore
Annamalai Narayanan, Mahinthan Chandramohan, Lihui Chen & Yang Liu

Corresponding author

Correspondence to Annamalai Narayanan.

Additional information

Communicated by: David Lo

Appendix: Qualitative Analysis of Base Kernels

The detection capabilities of the base kernels and kernel combinations can also be inferred by visualizing the kernel matrices. To this end, we present the kernel matrix of all the samples used in CE1 as a heat map in Fig. 7. The first (top-left) 5,000 samples are benign apps from the GP1 collection and the subsequent (bottom-right) 5,600 samples are malware from the DR collection. Every cell in the kernel matrix represents the similarity value between a pair of apps. Dark and light shades in cells indicate low and high similarity values, respectively.

It can be clearly seen that, in all the views, the malware apps exhibit higher similarity among themselves than the benign apps do. This qualitative depiction reinforces the observations on homogeneity in the DR collection that we discussed above. Also, the inferences on the individual base kernels’ detection potential discussed in RQ1.1 can be observed qualitatively in Fig. 7a-e. For instance, the API kernel separates the malware and benign samples better than the other base kernels. Also, the non-uniform linear combination of kernels learnt by SMO-MKL (Fig. 7g) offers the best separation between the samples of the two classes.
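
The separation that a heat map conveys visually can also be summarized numerically. The following is a minimal sketch, assuming a kernel matrix whose first rows and columns are benign apps followed by malware (the ordering described above) and at least two apps per class; the function name `block_means` and the toy 4×4 matrix are hypothetical and are not taken from the paper's data.

```python
from typing import Dict, List


def block_means(kernel: List[List[float]], n_benign: int) -> Dict[str, float]:
    """Average similarity within and across the benign/malware blocks.

    Rows and columns 0..n_benign-1 are assumed benign, the rest malware.
    Self-similarities on the diagonal are excluded from the averages.
    """
    n = len(kernel)
    benign = range(n_benign)
    malware = range(n_benign, n)

    def mean(rows, cols) -> float:
        values = [kernel[i][j] for i in rows for j in cols if i != j]
        return sum(values) / len(values)

    return {
        "benign_benign": mean(benign, benign),
        "malware_malware": mean(malware, malware),
        "benign_malware": mean(benign, malware),
    }


# Toy kernel matrix: two benign apps followed by two malware apps.
toy = [
    [1.0, 0.3, 0.1, 0.2],
    [0.3, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.9],
    [0.2, 0.1, 0.9, 1.0],
]
summary = block_means(toy, n_benign=2)
```

A within-malware mean that is well above the cross-class mean corresponds to a light bottom-right block set against darker off-diagonal blocks in a heat map of this kind.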

About this article

Cite this article

Narayanan, A., Chandramohan, M., Chen, L. et al. A multi-view context-aware approach to Android malware detection and malicious code localization. Empir Software Eng 23, 1222–1274 (2018). https://doi.org/10.1007/s10664-017-9539-8

Issue Date: June 2018
