Leveraging Support Vector Machine for Opcode Density Based Detection of Crypto-Ransomware

  • James Baldwin
  • Ali Dehghantanha
Part of the Advances in Information Security book series (ADIS, volume 70)


Ransomware is a significant global threat, and the prevalent ransomware-as-a-service model makes it easy to deploy. Machine learning approaches that combine opcode characteristics with a Support Vector Machine have proven successful for general malware detection. This research focuses on crypto-ransomware: it uses static analysis of malicious and benign Portable Executable files to extract 443 opcodes across all samples and represents them as density histograms within the dataset. Using the SMO classifier with the PUK kernel in the WEKA machine learning toolset, it demonstrates that this methodology can achieve 100% precision when differentiating between ransomware and goodware, and 96.5% precision when differentiating between five crypto-ransomware families and goodware. Moreover, eight different attribute selection methods are evaluated to achieve significant feature reduction. With the CorrelationAttributeEval method, close to 100% precision can be maintained with a feature reduction of 59.5%. The CFSSubset filter achieves the highest feature reduction, 97.7%, but with a slightly lower precision of 94.2%.
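
To make the idea of an opcode density histogram concrete, the following is a minimal Python sketch. It assumes that "density" means the relative frequency of each opcode within one sample's disassembly; the abstract does not state the exact normalisation. The function name `opcode_density`, the toy opcode sequence, and the seven-entry vocabulary are hypothetical and stand in for the 443 opcodes extracted in the study.

```python
from collections import Counter


def opcode_density(opcodes, vocabulary):
    """Return the relative frequency of each vocabulary opcode in one sample.

    Assumption: density = occurrences of the opcode / total opcodes in the sample.
    """
    counts = Counter(op.upper() for op in opcodes)
    total = sum(counts.values())
    if total == 0:
        # An empty disassembly yields an all-zero histogram.
        return {op: 0.0 for op in vocabulary}
    return {op: counts.get(op, 0) / total for op in vocabulary}


# Toy data for illustration only; not taken from the study's dataset.
sample = ["mov", "push", "mov", "call", "pop", "mov", "and", "jb"]
vocab = ["MOV", "PUSH", "CALL", "POP", "AND", "JB", "FDIVP"]

densities = opcode_density(sample, vocab)
for opcode in vocab:
    print(f"{opcode:6s} {densities[opcode]:.3f}")
```

Each sample then becomes one row of the dataset, with one numeric attribute per opcode, which is the kind of fixed-length vector a classifier such as an SVM expects.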

Using a ranking method applied across the attribute selection evaluators, the opcodes with the highest predictive importance have been identified as FDIVP, AND, SETLE, XCHG, SETNBE, SETNLE, JB, FILD, JLE, POP, CALL, FSUB, FMUL, MUL, SETBE, FISTP, FSUBRP, INC, FIDIV, FSTSW, JA. The MOV and PUSH opcodes, represented in the dataset with significantly higher density, do not actually have high predictive importance, whereas some rarer opcodes such as SETBE and FIDIV do.
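
One way to combine the rankings from several attribute selection evaluators is sketched below in Python. The abstract does not describe the ranking method actually used, so this is an assumption: each opcode is scored by its mean rank position across evaluators, and an opcode missing from an evaluator's list is given the position just past the end of that list. The function name `aggregate_ranks`, the evaluator labels, and the toy rankings are hypothetical.

```python
def aggregate_ranks(rankings):
    """Order features by mean rank position across evaluators (lower is better).

    rankings maps an evaluator name to its list of features, best first.
    Assumption: a feature missing from a list is ranked len(list) + 1.
    """
    features = set().union(*rankings.values())
    mean_rank = {}
    for feature in features:
        positions = [
            ordered.index(feature) + 1 if feature in ordered else len(ordered) + 1
            for ordered in rankings.values()
        ]
        mean_rank[feature] = sum(positions) / len(positions)
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(mean_rank, key=lambda f: (mean_rank[f], f))


# Toy rankings for illustration only; not results from the study.
toy_rankings = {
    "evaluator_a": ["FDIVP", "AND", "MOV"],
    "evaluator_b": ["AND", "FDIVP", "SETLE"],
    "evaluator_c": ["FDIVP", "SETLE", "AND"],
}

combined = aggregate_ranks(toy_rankings)
print(combined)
```

A mean-rank scheme rewards opcodes that score consistently well across evaluators rather than those that appear often in the data, which matches the observation that high-density opcodes such as MOV and PUSH need not be highly predictive.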


Keywords: Malware; Ransomware; Ransomware detection; Ransomware family detection



The authors would like to thank VirusTotal for providing access to its Intelligence platform to assist with the dataset creation, and Ransomware Tracker for being an invaluable resource for current ransomware threat detection.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. School of Computing, Science and Engineering, University of Salford, Manchester, UK
  2. Department of Computer Science, University of Sheffield, Sheffield, UK
