Abstract
Because malware poses a major threat to Internet security, malware detection is of great interest to both the anti-malware industry and researchers. Features beyond file content, such as file-to-file relations, are now being leveraged for malware detection and provide invaluable insight into the properties of file samples. However, much remains to be understood about the relationships between malware and benign files. In this paper, based on the file-to-file relation network, we design several new and robust graph-based features for malware detection and reveal the relationship characteristics of that network. Based on the designed features and two findings, we first apply the Malicious Score Inference Algorithm (MSIA) to select representative samples from the large collection of unknown files for labeling, and then use the Belief Propagation (BP) algorithm to detect malware. To the best of our knowledge, this is the first investigation of the relationship characteristics of the file-to-file relation network in malware detection using social network analysis. A comprehensive experimental study on a large collection of file sample relations, obtained from the clients of the anti-malware software of Comodo Security Solutions Incorporation, is performed to compare various malware detection approaches. Promising experimental results demonstrate that our proposed methods outperform alternative data-mining-based detection techniques in both accuracy and efficiency.
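To make the idea of propagating labels over a file-to-file relation network concrete, the following is a minimal sketch of generic loopy belief propagation with two states (benign, malware). It is not the paper's MSIA or its exact BP formulation, which the abstract does not specify. The file names, the prior values, and the `homophily` edge potential are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def belief_propagation(edges, priors, homophily=0.9, default_prior=0.5, iterations=10):
    """Generic loopy BP on an undirected graph; state 0 = benign, 1 = malware.

    `priors` maps a labeled file to its assumed prior probability of being
    malware; unlabeled files get `default_prior`. All values are illustrative.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    def phi(node):
        # Node potential: (P(benign), P(malware)) before seeing any neighbors.
        p = priors.get(node, default_prior)
        return (1.0 - p, p)

    # Edge potential: assumes related files tend to share a label.
    psi = ((homophily, 1.0 - homophily), (1.0 - homophily, homophily))

    # messages[(u, v)] is the message from u to v over v's two states.
    messages = {(u, v): (0.5, 0.5) for u in neighbors for v in neighbors[u]}

    for _ in range(iterations):
        updated = {}
        for (u, v) in messages:
            out = []
            for state_v in (0, 1):
                total = 0.0
                for state_u in (0, 1):
                    # Product of messages into u from all neighbors except v.
                    incoming = 1.0
                    for w in neighbors[u]:
                        if w != v:
                            incoming *= messages[(w, u)][state_u]
                    total += phi(u)[state_u] * psi[state_u][state_v] * incoming
                out.append(total)
            norm = out[0] + out[1]
            updated[(u, v)] = (out[0] / norm, out[1] / norm)
        messages = updated

    # Final belief: node potential times all incoming messages, normalized.
    beliefs = {}
    for node in neighbors:
        b = list(phi(node))
        for w in neighbors[node]:
            b[0] *= messages[(w, node)][0]
            b[1] *= messages[(w, node)][1]
        beliefs[node] = b[1] / (b[0] + b[1])
    return beliefs


# Hypothetical relation network: two chains, one seeded with a known malware
# sample ("a.exe") and one with a known benign sample ("d.exe").
edges = [("a.exe", "b.dll"), ("b.dll", "c.exe"),
         ("d.exe", "e.dll"), ("e.dll", "f.exe")]
priors = {"a.exe": 0.99, "d.exe": 0.01}
beliefs = belief_propagation(edges, priors)
for name in sorted(beliefs):
    print(name, round(beliefs[name], 3))
```

In this toy network, unlabeled files that sit closer to the labeled malware sample end up with a higher malware belief than files further away, which is the "guilt by association" behaviour that relation-based detection relies on.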
Acknowledgments
The authors would like to thank the anti-malware experts of Comodo Security Lab for the data collection, as well as for the helpful discussions and support.
© 2015 Springer International Publishing Switzerland
Cite this paper
Chen, L., Hardy, W., Ye, Y., Li, T. (2015). Analyzing File-to-File Relation Network in Malware Detection. In: Wang, J., et al. Web Information Systems Engineering – WISE 2015. WISE 2015. Lecture Notes in Computer Science, vol 9418. Springer, Cham. https://doi.org/10.1007/978-3-319-26190-4_28
DOI: https://doi.org/10.1007/978-3-319-26190-4_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26189-8
Online ISBN: 978-3-319-26190-4
eBook Packages: Computer Science, Computer Science (R0)