Malware Triage Based on Static Features and Public APT Reports

  • Giuseppe Laurenza
  • Leonardo Aniello
  • Riccardo Lazzeretti
  • Roberto Baldoni
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10332)


Understanding the behavior of malware requires a semi-automatic approach that combines complex software tools with human analysts in the loop. However, the huge number of malicious samples developed daily calls for a prioritization mechanism that carefully selects the samples that really deserve further examination by analysts. This prevents computational resources from being overloaded and human analysts from being saturated. In this paper we introduce a malware triage stage where samples are quickly and automatically examined to promptly decide whether they should be immediately dispatched to human analysts or to other specific automatic analysis queues, rather than following the common and slow analysis pipeline. This triage stage is encapsulated into an architecture for semi-automatic malware analysis presented in a previous work. We propose an approach for sample prioritization and its realization within that architecture. Our analysis focuses on malware developed by Advanced Persistent Threats (APTs). We build the knowledge base used in the triage from known APTs described in publicly available reports. To make the triage as fast as possible, we consider only static malware features, which can be extracted with negligible delay and without executing the malware samples, and we use them to train a random forest classifier. The classifier has been tuned to maximize its precision, so that analysts and other components of the architecture are most likely to receive only malware correctly identified as being similar to known APTs, and do not waste important resources on false positives. A preliminary analysis shows high precision and accuracy, as desired.
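The precision-oriented tuning described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that some classifier (for example, a random forest trained on static features) already outputs a confidence score per sample, and it picks the lowest score threshold that reaches a target precision on a labelled validation set. The function names, queue names, scores, and labels are all hypothetical.

```python
from typing import List, Optional


def precision_at(scores: List[float], labels: List[int], threshold: float) -> Optional[float]:
    """Precision of the rule 'flag as APT-like when score >= threshold'.

    labels use 1 for samples truly related to a known APT, 0 otherwise.
    Returns None when nothing is flagged, since precision is undefined.
    """
    flagged = [label for score, label in zip(scores, labels) if score >= threshold]
    if not flagged:
        return None
    return sum(flagged) / len(flagged)


def pick_threshold(scores: List[float], labels: List[int],
                   target_precision: float) -> Optional[float]:
    """Return the lowest observed score that meets the target precision.

    Choosing the lowest qualifying threshold keeps as many true APT-like
    samples as possible while still limiting false positives.
    """
    for threshold in sorted(set(scores)):
        precision = precision_at(scores, labels, threshold)
        if precision is not None and precision >= target_precision:
            return threshold
    return None


def triage(score: float, threshold: float) -> str:
    """Dispatch a sample to a (hypothetical) queue based on its score."""
    return "analyst_queue" if score >= threshold else "standard_pipeline"


# Hypothetical validation data: classifier scores and ground-truth labels.
val_scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.20]
val_labels = [1, 1, 1, 0, 1, 0, 0]

threshold = pick_threshold(val_scores, val_labels, target_precision=0.99)
print(threshold, triage(0.85, threshold), triage(0.65, threshold))
```

A stricter target precision raises the threshold and sends fewer samples to analysts, which trades recall for fewer false positives, in line with the goal stated in the abstract.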


Malware analysis, Advanced Persistent Threats, Static analysis, Malware triage



This work has been partially supported by a grant of the Italian Presidency of Ministry Council, and by CINI Cybersecurity National Laboratory within the project FilieraSicura: Securing the Supply Chain of Domestic Critical Infrastructures from Cyber Attacks, funded by CISCO Systems Inc. and Leonardo SpA.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Giuseppe Laurenza (1)
  • Leonardo Aniello (1)
  • Riccardo Lazzeretti (1)
  • Roberto Baldoni (1, 2)
  1. Department of Computer and System Sciences “Antonio Ruberti”, Research Center of Cyber Intelligence and Information Security (CIS), Sapienza Università di Roma, Rome, Italy
  2. CINI Cybersecurity National Laboratory, Rome, Italy
