L(a)ying in (Test)Bed

Beppler, Tamy; Botacin, Marcus; Ceschin, Fabrício J. O.; Oliveira, Luiz E. S.; Grégio, André

doi:10.1007/978-3-030-30215-3_19

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 11723)

Included in the following conference series:

International Conference on Information Security

Abstract

The number of malware variants released daily has made manual analysis impractical. Although potentially faster, automated analysis techniques (e.g., static and dynamic) have shortcomings that malware authors exploit to thwart each of them, i.e., to prevent malicious software from being detected or correctly classified. Researchers then turned to traditional machine learning algorithms to produce efficient, effective classification methods, but the resulting models are also prone to errors and attacks. Novel representations of the “subject”, such as malware textures, were proposed to overcome these limitations. In this paper, our initial goal was to evaluate texture analysis for malware classification using samples collected in the wild and to compare our results with the state of the art. During our tests, we discovered that texture analysis may be infeasible for the task at hand if we use the same malware representation employed by other authors. Furthermore, we discovered that naive premises about the selection of samples in the datasets introduced biases that, in the end, produced unrealistic results. Finally, our tests with a broader, unfiltered dataset show that texture analysis may be impractical for correct malware classification in a real-world scenario, in which there is a great variety of families and some of them use quite sophisticated obfuscation techniques.
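To make the idea of a "malware texture" concrete, the following is a minimal sketch, not the authors' pipeline. It assumes that a binary's raw bytes are laid out as rows of grayscale pixels and that the texture is summarised with a basic 3x3 local binary pattern (LBP) histogram. The abstract does not state which descriptor, image width, or classifier was used, so `bytes_to_image`, `lbp_histogram`, and the `width` parameter are hypothetical names chosen for illustration.

```python
from __future__ import annotations

from collections import Counter


def bytes_to_image(data: bytes, width: int = 16) -> list[list[int]]:
    """Reshape a byte string into rows of `width` grayscale pixels (0-255).

    Trailing bytes that do not fill a whole row are dropped.
    The width is an assumption, not a value taken from the paper.
    """
    rows = len(data) // width
    return [list(data[r * width:(r + 1) * width]) for r in range(rows)]


# Neighbour offsets (dy, dx), clockwise from the top-left pixel.
_NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_histogram(image: list[list[int]]) -> list[float]:
    """Return a normalised 256-bin histogram of basic 3x3 LBP codes.

    Each interior pixel gets an 8-bit code: bit i is set when the i-th
    neighbour is greater than or equal to the centre pixel.
    """
    counts: Counter[int] = Counter()
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            centre = image[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(_NEIGHBOURS):
                if image[y + dy][x + dx] >= centre:
                    code |= 1 << bit
            counts[code] += 1
    total = sum(counts.values())
    if total == 0:
        # Image too small to contain any interior pixel.
        return [0.0] * 256
    return [counts[c] / total for c in range(256)]


if __name__ == "__main__":
    sample = bytes(range(256)) * 4  # stand-in for a binary's raw bytes
    features = lbp_histogram(bytes_to_image(sample, width=32))
    print(len(features), round(sum(features), 6))
```

A feature vector of this kind would then be passed to a conventional classifier. Because the vector is computed directly from raw bytes, packing or obfuscation changes it, which fits the abstract's caution about obfuscated families in unfiltered datasets.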

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.

Notes

1.
Additional information about the samples will be made available after acceptance, so as not to violate the conference's blind-review requirement.

Author information

Authors and Affiliations

Federal University of Paraná (UFPR), Curitiba, PR, Brazil
Tamy Beppler, Marcus Botacin, Fabrício J. O. Ceschin, Luiz E. S. Oliveira & André Grégio

Corresponding author

Correspondence to André Grégio.

Editor information

Editors and Affiliations

The Ohio State University, Columbus, OH, USA
Zhiqiang Lin
University of Maryland, College Park, MD, USA
Charalampos Papamanthou
Stony Brook University, Stony Brook, NY, USA
Michalis Polychronakis

About this paper

Cite this paper

Beppler, T., Botacin, M., Ceschin, F.J.O., Oliveira, L.E.S., Grégio, A. (2019). L(a)ying in (Test)Bed. In: Lin, Z., Papamanthou, C., Polychronakis, M. (eds) Information Security. ISC 2019. Lecture Notes in Computer Science, vol 11723. Springer, Cham. https://doi.org/10.1007/978-3-030-30215-3_19

DOI: https://doi.org/10.1007/978-3-030-30215-3_19
Published: 02 September 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30214-6
Online ISBN: 978-3-030-30215-3
eBook Packages: Computer Science, Computer Science (R0)
