Why Current Statistical Approaches to Ransomware Detection Fail

Pont, Jamie; Arief, Budi; Hernandez-Castro, Julio

doi:10.1007/978-3-030-62974-8_12

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 12472)

Included in the following conference series:

International Conference on Information Security


Abstract

Using basic statistical techniques to detect ransomware is a popular and intuitive strategy: statistical tests can identify randomness, which in turn can indicate the presence of encryption and, by extension, a ransomware attack. However, common file formats such as images and compressed data can look random from the perspective of some of these tests. In this work, we investigate the current widespread use of statistical tests in ransomware detection, focusing primarily on false positive rates. Our main aim is to show that the current over-dependence on simple statistical tests within anti-ransomware tools can seriously undermine the reliability and consistency of ransomware detection by causing frequent misclassifications. We determined thresholds for five key statistics frequently used in detecting randomness, namely Shannon entropy, chi-square, arithmetic mean, Monte Carlo estimation of Pi and serial correlation coefficient. We obtained a large dataset of 84,327 files comprising images, compressed data and encrypted data. We then tested these thresholds (taken from previous publications where possible) against our dataset, showing that the rate of false positives is far beyond what could be considered acceptable. False positive rates were often above 50% and, on several occasions, above 90%. False negative rates were generally between 5% and 20%, which is also far too high. As a direct result of these experiments, we conclude that relying on these simple statistical approaches is not sufficient to detect ransomware attacks consistently. We instead recommend exploring higher-order statistics such as skewness and kurtosis for future ransomware detection techniques.
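For readers unfamiliar with the five statistics named in the abstract, the following is a minimal Python sketch of how each can be computed over a buffer of bytes. It is an illustration only and not the authors' implementation: the function names are hypothetical, the 6-byte grouping in the Monte Carlo estimate and the wrap-around in the serial correlation are assumptions chosen for simplicity, and no detection thresholds are included because the abstract does not state them.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte, from 0.0 (constant) to 8.0 (uniform)."""
    n = len(data)
    counts = Counter(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def chi_square(data: bytes) -> float:
    """Chi-square statistic against a uniform distribution over 256 byte values."""
    expected = len(data) / 256
    counts = Counter(data)
    return sum((counts.get(b, 0) - expected) ** 2 / expected for b in range(256))


def arithmetic_mean(data: bytes) -> float:
    """Mean byte value; close to 127.5 for uniformly random bytes."""
    return sum(data) / len(data)


def monte_carlo_pi(data: bytes) -> float:
    """Estimate Pi by reading each 6-byte group as an (x, y) point (assumed layout)
    and counting the fraction of points that fall inside a quarter circle."""
    radius_sq = (256 ** 3 - 1) ** 2
    inside = total = 0
    for i in range(0, len(data) - 5, 6):
        x = int.from_bytes(data[i:i + 3], "big")
        y = int.from_bytes(data[i + 3:i + 6], "big")
        total += 1
        if x * x + y * y <= radius_sq:
            inside += 1
    return 4 * inside / total if total else float("nan")


def serial_correlation(data: bytes) -> float:
    """Lag-1 correlation between each byte and its successor (wrapping at the end).
    Returns NaN when every byte is identical, since the variance is zero."""
    n = len(data)
    s1 = sum(data)
    s2 = sum(b * b for b in data)
    s12 = sum(data[i] * data[(i + 1) % n] for i in range(n))
    denom = n * s2 - s1 * s1
    return (n * s12 - s1 * s1) / denom if denom else float("nan")


def randomness_profile(data: bytes) -> dict:
    """Collect all five statistics for one non-empty buffer."""
    return {
        "entropy": shannon_entropy(data),
        "chi_square": chi_square(data),
        "mean": arithmetic_mean(data),
        "monte_carlo_pi": monte_carlo_pi(data),
        "serial_correlation": serial_correlation(data),
    }
```

Calling `randomness_profile` on the contents of a compressed image and on an encrypted file will often give similar values for several of these statistics, which is the kind of overlap that, according to the abstract, produces false positives when fixed thresholds are applied.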



Author information

Authors and Affiliations

University of Kent, Canterbury, UK
Jamie Pont, Budi Arief & Julio Hernandez-Castro

Corresponding author

Correspondence to Jamie Pont.

Editor information

Editors and Affiliations

Institute of Cybersecurity and Cryptology, University of Wollongong, Wollongong, NSW, Australia
Willy Susilo
Singapore Management University, Singapore, Singapore
Robert H. Deng
Institute of Cybersecurity and Cryptology, University of Wollongong, Wollongong, NSW, Australia
Fuchun Guo
Institute of Cybersecurity and Cryptology, University of Wollongong, Wollongong, NSW, Australia
Yannan Li
Informatics, Petra Christian University, Surabaya, Indonesia
Rolly Intan

About this paper

Cite this paper

Pont, J., Arief, B., Hernandez-Castro, J. (2020). Why Current Statistical Approaches to Ransomware Detection Fail. In: Susilo, W., Deng, R.H., Guo, F., Li, Y., Intan, R. (eds) Information Security. ISC 2020. Lecture Notes in Computer Science, vol 12472. Springer, Cham. https://doi.org/10.1007/978-3-030-62974-8_12

DOI: https://doi.org/10.1007/978-3-030-62974-8_12
Published: 25 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-62973-1
Online ISBN: 978-3-030-62974-8
eBook Packages: Computer Science, Computer Science (R0)
