Skip to main content

Big data in cybersecurity: a survey of applications and future trends


With over 4.57 billion people using the Internet in 2020, the amount of data being generated has exceeded 2.5 quintillion bytes per day. This rapid increase in the generation of data has pushed the applications of big data to new heights; one of which is cybersecurity. The paper aims to introduce a thorough survey on the use of big data analytics in building, improving, or defying cybersecurity systems. This paper surveys state-of-the-art research in different areas of applications of big data in cybersecurity. The paper categorizes applications into areas of intrusion and anomaly detection, spamming and spoofing detection, malware and ransomware detection, code security, cloud security, along with another category surveying other directions of research in big data and cybersecurity. The paper concludes with pointing to possible future directions in research on big data applications in cybersecurity.

This is a preview of subscription content, access via your institution.

Fig. 1
Fig. 2
Fig. 3
Fig. 4


  1. 20 ransomware statistics youre powerless to resist reading - hashed out by the ssl store. Accessed on 01 Aug 2020

  2. 2019 cyber security statistics trends & data: The ultimate list of cyber security stats—purplesec. Accessed on 30 Jul 2020

  3. 2020 trustwave global security report—trustwave. Accessed on 01 Aug 2020

  4. 5 cybersecurity threats to be aware of in 2020—ieee computer society. Accessed on 30 Jul 2020

  5. Apple reveals windows 10 is four times more popular than the mac. howpublished, Accessed on 3 Dec 2018

  6. Computer science. Accessed on 30 Jul 2020

  7. Cyberthreat trends: 15 cybersecurity threats for 2020—nortonlifelock. Accessed on 30 Jul 2020

  8. Github—mozilla/openwpm: A web privacy measurement framework. Accessed on 23 Mar 2019

  9. Global digital population as of april 2020. Accessed: 13 May 2020

  10. Global\_2020\_forecast\_highlights. Accessed on 30 Jul 2020

  11. Half of the malware detected in 2019 was classified as zero-day threats, making it the most common malware to date - cynet. Accessed on 30 Jul 2020

  12. How much data do we create every day? Accessed 22 Oct 2018

  13. The iot rundown for 2020: Stats, risks, and solutions—security today. Accessed on 30 Jul 2020

  14. Malware statistics youd better get your computer vaccinated. Accessed 29 May 2020

  15. Microsoft vulnerabilities more than doubled in 2017. howpublished, Accessed 3 Dec 2018

  16. Top cybersecurity threats in 2020. Accessed on 30 Jul 2020

  17. Twitter hack: Us and uk teens arrested over breach of celebrity accounts—twitter—the guardian. Accessed on 01 Aug 2020

  18. What is big data analytics, howpublished. Accessed 24 Nov 2018

  19. Ransomware cyber attacks: Which industries are being hit the hardest? Accessed on 08 Dec 2018

  20. Us hospital pays $55,000 to hackers after ransomware attack—zdnet. Accessed on 08 Dec 2018

  21. Abdlhamed M, Kifayat K, Shi Q, Hurst W (2017) Intrusion prediction systems. Information fusion for cyber-security analytics. Springer, New York, pp 155–174

    Chapter  Google Scholar 

  22. Abraham S, Nair S (2015) Predictive cyber-security analytics framework: a non-homogenous markov model for security quantification. arXiv:1501.01901

  23. Aditham S, Ranganathan N (2018) A system architecture for the detection of insider attacks in big data systems. IEEE Trans Depend Secure Comput 15(6):974–987

    Article  Google Scholar 

  24. Alani MM (2016) What is the cloud? Elements of cloud computing security. Springer, New York, pp 1–14

    Google Scholar 

  25. AlEroud A, Karabatis G (2017) Using contextual information to identify cyber-attacks. Information fusion for cyber-security analytics. Springer, New York, pp 1–16

    Google Scholar 

  26. Aleroud A, Zhou L (2017) Phishing environments, techniques, and countermeasures: a survey. Comput Secur 68:160–196

    Article  Google Scholar 

  27. Alguliyev, R., Imamverdiyev, Y.: Big data: big promises for information security. In: Application of Information and Communication Technologies (AICT), 2014 IEEE 8th international conference on. IEEE, pp 1–4

  28. Alhuzali A, Gjomemo R, Eshete B, Venkatakrishnan V (2018) \(\{\)NAVEX\(\}\): Precise and scalable exploit generation for dynamic web applications. In: 27th \(\{\)USENIX\(\}\) Security Symposium (\(\{\)USENIX\(\}\) Security 18), pp 377–392

  29. Alrabaee S, Shirani P, Wang L, Debbabi M (2018) Fossil: a resilient and efficient system for identifying foss functions in malware binaries. ACM Trans Priv Secur 21(2):8

    Article  Google Scholar 

  30. Alsadhan AA, Hussain A, Alani MM (2018) Detecting ndp distributed denial of service attacks using machine learning algorithm based on flow-based representation. In: 2018 11th International Conference on Developments in eSystems Engineering (DeSE). IEEE, pp 134–140

  31. Amini L, Christodorescu M, Cohen MA, Parthasarathy S, Rao J, Sailer R, Schales DL, Venema WZ, Verscheure O (2015) Adaptive cyber-security analytics. US Patent 9,032,521

  32. Baikalov IA, Froelich C, McConnell T, McGloughlin JP et al (2016) Cyber security analytics architecture. US Patent 9,516,041

  33. Balaban D (2020) 11 types of spoofing attacks every security professional should know about—2020-03-24—security magazine. Accessed on 01 Aug 2020

  34. Banescu S, Collberg C, Pretschner A (2017) Predicting the resilience of obfuscated code against symbolic execution attacks via machine learning. In: 26th \(\{\)USENIX\(\}\) Security Symposium (\(\{\)USENIX\(\}\) Security 17), pp 661–678

  35. Barradas D, Santos N, Rodrigues L (2018) Effective detection of multimedia protocol tunneling using machine learning. In: 27th \(\{\)USENIX\(\}\) Security Symposium (\(\{\)USENIX\(\}\) Security 18), pp 169–185

  36. Biham E, Shamir A (1991) Differential cryptanalysis of des-like cryptosystems. J Cryptol 4(1):3–72

    Article  MathSciNet  MATH  Google Scholar 

  37. Bilge L, Han Y, Dell’Amico M (2017) Riskteller: Predicting the risk of cyber incidents. In: Proceedings of the 2017 ACM SIGSAC conference on computer and communications security. ACM, pp 1299–1311

  38. Cao P, Badger EC, Kalbarczyk ZT, Iyer RK, Withers A, Slagell AJ (2015) Towards an unified security testbed and security analytics framework. In: Proceedings of the 2015 symposium and bootcamp on the science of security. ACM

  39. Cao Y, Yang J (2015) Towards making systems forget with machine unlearning. In: 2015 IEEE symposium on security and privacy. IEEE, pp 463–480

  40. Chakraborty R, Vishik C, Rao HR (2013) Privacy preserving actions of older adults on social media: Exploring the behavior of opting out of information sharing. Decis Support Syst 55(4):948–956

    Article  Google Scholar 

  41. Chiew KL, Yong KSC, Tan CL (2018) A survey of phishing attacks: Their types, vectors and technical approaches. Expert Syst Appl 106:1–20

    Article  Google Scholar 

  42. Cinque M, Della Corte R, Pecchia A (2019) Microservices monitoring with event logs and black box execution tracing. IEEE Trans Serv Comput

  43. Cinque M, Della Corte R, Pecchia A (2020) Contextual filtering and prioritization of computer application logs for security situational awareness. Future Gener Comput Syst 111:668–680

    Article  Google Scholar 

  44. Curtin M, Dolske J (1998) A brute force search of des keyspace. In: 8th Usenix Symposium, January. Citeseer, pp 26–29

  45. Cuzzocrea A, Martinelli F, Mercaldo F, Grasso GM (2018) Experimenting and assessing machine learning tools for detecting and analyzing malicious behaviors in complex environments. J Reliab Intell Environ 4(4):225–245

    Article  Google Scholar 

  46. DATA G (2018) Malware in 2018: the danger is on the web—g data blog. Accessed on 31 Mar 2019

  47. Dias LF, Correia M (2020) Big data analytics for intrusion detection: an overview. In: Handbook of research on machine and deep learning applications for cyber security. IGI Global, pp 292–316

  48. Du M, Li F, Zheng G, Srikumar V (2017) Deeplog: Anomaly detection and diagnosis from system logs through deep learning. In: Proceedings of the 2017 ACM SIGSAC conference on computer and communications security. ACM, New York, pp 1285–1298

  49. Englehardt S, Narayanan A (2016) Online tracking: A 1-million-site measurement and analysis. In: Proceedings of the 2016 ACM SIGSAC conference on computer and communications security. ACM, pp 1388–1401

  50. Fang W, Wen XZ, Zheng Y, Zhou M (2017) A survey of big data security and privacy preserving. IETE Tech Rev 34(5):544–560

    Article  Google Scholar 

  51. Farris KA, Shah A, Cybenko G, Ganesan R, Jajodia S (2018) Vulcon: A system for vulnerability prioritization, mitigation, and management. ACM Trans Priv Secur 21(4):16:1–16:28.

  52. Feng Q, Zhou R, Xu C, Cheng Y, Testa B, Yin H (2016) Scalable graph-based bug search for firmware images. In: Proceedings of the 2016 ACM SIGSAC conference on computer and communications security. ACM, New York, pp 480–491.

  53. Funk C, Garnaeva M (2013) Kaspersky security bulletin 2013. overall statistics for 2013, vol 10. Kaspersky Lab

  54. Gai K, Qiu M, Elnagdy SA (2016) A novel secure big data cyber incident analytics framework for cloud-based cybersecurity insurance. In: 2016 IEEE 2nd International Conference on Big Data Security on Cloud (BigDataSecurity), IEEE International Conference on High Performance and Smart Computing (HPSC), and IEEE international conference on intelligent data and security (IDS). IEEE, pp 171–176

  55. Gandomi A, Haider M (2015) Beyond the hype: Big data concepts, methods, and analytics. Int J Inf Manag 35(2):137–144

    Article  Google Scholar 

  56. García S, Ramírez-Gallego S, Luengo J, Benítez JM, Herrera F (2016) Big data preprocessing: methods and prospects. Big Data Anal 1(1):9

    Article  Google Scholar 

  57. Gong NZ, Liu B (2018) Attribute inference attacks in online social networks. ACM Trans Priv Secur 21(1):3

    Article  Google Scholar 

  58. Gou L, Zhou MX, Yang H (2014) Knowme and shareme: understanding automatically discovered personality traits from social media and user sharing preferences. In: Proceedings of the SIGCHI conference on human factors in computing systems. ACM, New York, pp 955–964

  59. Grahn K, Westerlund M, Pulkkis G (2017) Analytics for network security: a survey and taxonomy. Information fusion for cyber-security analytics. Springer, New York, pp 175–193

    Chapter  Google Scholar 

  60. Guo W, Mu D, Xu J, Su P, Wang G, Xing X (2018) Lemna: explaining deep learning based security applications. In: Proceedings of the 2018 ACM SIGSAC conference on computer and communications security, CCS ’18. ACM, New York, pp 364–379.

  61. Gutierrez CN, Kim T, Della Corte R, Avery J, Goldwasser D, Cinque M, Bagchi S (2018) Learning from the ones that got away: Detecting new forms of phishing attacks. IEEE Trans Depend Secure Comput 15(6):988–1001

    Article  Google Scholar 

  62. Gyöngyi Z, Garcia-Molina H, Pedersen J (2004) Combating web spam with trustrank. In: Proceedings of the Thirtieth international conference on Very large data bases, vol 30. VLDB Endowment, pp 576–587

  63. Hale B (2016) Estimating log generation for security information event and log management. Retrieved Sep 15

  64. He P, Zhu J, He S, Li J, Lyu MR (2018) Towards automated log parsing for large-scale log data analysis. IEEE Trans Depend Secure Comput 15(6):931–944

    Article  Google Scholar 

  65. Hong JB, Nhlabatsi A, Kim DS, Hussein A, Fetais N, Khan KM (2019) Systematic identification of threats in the cloud: a survey. Comput Netw 150:46–69

    Article  Google Scholar 

  66. Hossain MN, Wang J, Weisse O, Sekar R, Genkin D, He B, Stoller SD, Fang G, Piessens F, Downing E, et al (2018) Dependence-preserving data compaction for scalable forensic analysis. In: 27th \(\{\)USENIX\(\}\) Security Symposium (\(\{\)USENIX\(\}\) Security 18), pp 1723–1740

  67. Huang DY, Aliapoulios MM, Li VG, Invernizzi L, Bursztein E, McRoberts K, Levin J, Levchenko K, Snoeren AC, McCoy D (2018) Tracking ransomware end-to-end. In: 2018 IEEE symposium on security and privacy (SP). IEEE, pp 618–631

  68. Ikram M, Onwuzurike L, Farooqi S, Cristofaro ED, Friedman A, Jourjon G, Kaafar MA, Shafiq MZ (2017) Measuring, characterizing, and detecting facebook like farms. ACM Trans Priv Secur 20(4):13

    Article  Google Scholar 

  69. Jansen K, Schäfer M, Moser D, Lenders V, Pöpper C, Schmitt J (2018) Crowd-gps-sec: Leveraging crowdsourcing to detect and localize gps spoofing attacks. In: 2018 IEEE symposium on security and privacy (SP). IEEE, pp 1018–1031

  70. John NA (2013) The social logics of sharing. Commun Rev 16(3):113–131

    Article  Google Scholar 

  71. Johnstone M, Peacock M (2020) Seven pitfalls of using data science in cybersecurity. In: Data Science in Cybersecurity and Cyberthreat Intelligence. Springer, Nwe York

  72. Jurgens D (2013) That’s what friends are for: Inferring location in online social media platforms based on social relationships. In: Seventh International AAAI conference on weblogs and social media

  73. Kelsey J, Schneier B, Wagner D (1996) Key-schedule cryptanalysis of idea, g-des, gost, safer, and triple-des. Annual International Cryptology Conference. Springer, New York, pp 237–251

    Google Scholar 

  74. Khan MUK, Park HS, Kyung CM (2019) Rejecting motion outliers for efficient crowd anomaly detection. IEEE Trans Inf Forensics Secur 14(2):541–556

    Article  Google Scholar 

  75. Kim D, Kwon BJ, Kozák K, Gates C, Dumitras T (2018) The broken shield: Measuring revocation effectiveness in the windows code-signing \(\{\)PKI\(\}\). In: 27th \(\{\)USENIX\(\}\) Security Symposium (\(\{\)USENIX\(\}\) Security 18), pp 851–868

  76. Kim S, Woo S, Lee H, Oh H (2017) Vuddy: A scalable approach for vulnerable code clone discovery. In: 2017 IEEE symposium on security and privacy (SP). IEEE, pp 595–614

  77. Koli J (2018) Randroid: Android malware detection using random machine learning classifiers. In: 2018 Technologies for smart-city energy security and power (ICSESP). IEEE, pp 1–6

  78. Kotenko I, Saenko I, Branitskiy A (2020) Machine learning and big data processing for cybersecurity data analysis. Data science in cybersecurity and cyberthreat intelligence. Springer, New York, pp 61–85

    Chapter  Google Scholar 

  79. Kumar R, Goyal R (2019) On cloud security requirements, threats, vulnerabilities and countermeasures: a survey. Comput Sci Rev 33:1–48

    Article  MathSciNet  Google Scholar 

  80. Kwon BJ, Mondal J, Jang J, Bilge L, Dumitraş T (2015) The dropper effect: insights into malware distribution with downloader graph analytics. In: Proceedings of the 22nd ACM SIGSAC conference on computer and communications security. ACM, New York, pp 1118–1129

  81. Laney D (2001) 3d data management: controlling data volume, velocity and variety. META Group Res Note 6(70):1

    Google Scholar 

  82. Li H, Xu X, Liu C, Ren T, Wu K, Cao X, Zhang W, Yu Y, Song D (2018) A machine learning approach to prevent malicious calls over telephony networks. In: 2018 IEEE symposium on security and privacy (SP). IEEE, pp 53–69

  83. Liao X, Yuan K, Wang X, Li Z, Xing L, Beyah R (2016) Acing the ioc game: Toward automatic discovery and analysis of open-source cyber threat intelligence. In: Proceedings of the 2016 ACM SIGSAC conference on computer and communications security. ACM, New York, pp 755–766

  84. Liao X, Yuan K, Wang X, Pei Z, Yang H, Chen J, Duan H, Du K, Alowaisheq E, Alrwais S, et al (2016) Seeking nonsense, looking for trouble: Efficient promotional-infection detection through semantic inconsistency search. In: 2016 IEEE symposium on security and privacy (SP). IEEE, pp 707–723

  85. MacDonald N (2012) Information security is becoming a big data analytics problem. Accessed on 13 May 2020

  86. Madi T, Jarraya Y, Alimohammadifar A, Majumdar S, Wang Y, Pourzandi M, Wang L, Debbabi M (2018) Isotop: auditing virtual networks isolation across cloud layers in openstack. ACM Trans Priv Secur 22(1):1

    Article  Google Scholar 

  87. Mahmood T, Afzal U (2013) Security analytics: big data analytics for cybersecurity: a review of trends, techniques and tools. In: Information assurance (ncia), 2013 2nd national conference on. IEEE, pp 129–134

  88. Majumdar S, Tabiban A, Jarraya Y, Oqaily M, Alimohammadifar A, Pourzandi M, Wang L, Debbabi M (2018) Learning probabilistic dependencies among events for proactive security auditing in clouds. J Comput Secur:1–38 (preprint)

  89. Maltby D (2011) Big data analytics. In: 74th Annual Meeting of the Association for Information Science and Technology (ASIST), pp 1–6

  90. Martha V (2015) Big data processing algorithms. Big data. Springer, New York, pp 61–91

    Chapter  Google Scholar 

  91. Matsui M (1993) Linear cryptanalysis method for des cipher. Workshop on the Theory and Application of of Cryptographic Techniques. Springer, New York, pp 386–397

    Google Scholar 

  92. Nadgowda S, Isci C, Bal M (2018) Déjàvu: bringing black-box security analytics to cloud. In: Proceedings of the 19th International middleware conference industry. ACM, New York, pp 17–24

  93. Nilizadeh S, Labrèche F, Sedighian A, Zand A, Fernandez J, Kruegel C, Stringhini G, Vigna G (2017) Poised: Spotting twitter spam off the beaten paths. In: Proceedings of the 2017 ACM SIGSAC conference on computer and communications security. ACM, New York, pp 1159–1174

  94. Oltisk J (2013) The big data security analytics era is here. Tech. rep, Enterprise Strategy Group

    Google Scholar 

  95. Pearce P, Ensafi R, Li F, Feamster N, Paxson V (2017) Augur: Internet-wide detection of connectivity disruptions. In: 2017 IEEE symposium on security and privacy (SP). IEEE, pp 427–443

  96. Pierazzi F, Casolari S, Colajanni M, Marchetti M (2016) Exploratory security analytics for anomaly detection. Comput Secur 56:28–49

    Article  Google Scholar 

  97. Reaves B, Vargas L, Scaife N, Tian D, Blue L, Traynor P, Butler KR (2018) Characterizing the security of the sms ecosystem with public gateways. ACM Trans Priv Secur 22(1):2

    Google Scholar 

  98. Richardson R, North MM (2017) Ransomware: evolution, mitigation and prevention. Int Manag Rev 13(1):10

    Google Scholar 

  99. Rieck K, Holz T, Willems C, Düssel P, Laskov P (2008) Learning and classification of malware behavior. International conference on detection of intrusions and malware, and vulnerability assessment. Springer, New York, pp 108–125

    Google Scholar 

  100. Rijmen V, Daemen J (2001) Advanced encryption standard. In: Proceedings of federal information processing standards publications. National Institute of Standards and Technology, pp 19–22

  101. Rose C (2011) The security implications of ubiquitous social media

  102. Salva S, Regainia L (2019) A catalogue associating security patterns and attack steps to design secure applications. J Comput Secur:1–26 (Preprint)

  103. Shen Y, Mariconti E, Vervier PA, Stringhini G (2018) Tiresias: predicting security events through deep learning. In: Proceedings of the 2018 ACM SIGSAC conference on computer and communications security. ACM, New York, pp 592–605

  104. Shu X, Araujo F, Schales DL, Stoecklin MP, Jang J, Huang H, Rao JR (2018) Threat intelligence computing. In: Proceedings of the 2018 ACM SIGSAC conference on computer and communications security. ACM, New York, pp 1883–1898

  105. Shu X, Yao DD, Ramakrishnan N, Jaeger T (2017) Long-span program behavior modeling and attack detection. ACM Trans Priv Secur 20(4):12

    Article  Google Scholar 

  106. Siadati H, Memon N (2017) Detecting structurally anomalous logins within enterprise networks. In: Proceedings of the 2017 ACM SIGSAC conference on computer and communications security. ACM, New York, pp 1273–1284

  107. Singer PW, Friedman A (2014) Cybersecurity: what everyone needs to know. Oxford University Press, Oxford

    Book  Google Scholar 

  108. Sipola T (2015) Knowledge discovery from network logs. Cyber security: analytics, technology and automation. Springer, New York, pp 195–203

    Chapter  Google Scholar 

  109. Siwicki B (2016) Ransomware attackers collect ransom from kansas hospital, dont unlock all the data, then demand more money. Healthcare IT News

  110. Standard DE et al (1977) Federal information processing standards publication 46. National Bureau of Standards, US Department of Commerce, vol 23

  111. Suciu O, Marginean R, Kaya Y, Daume III H, Dumitras T (2018) When does machine learning \(\{\)FAIL\(\}\)? generalized transferability for evasion and poisoning attacks. In: 27th \(\{\)USENIX\(\}\) Security Symposium (\(\{\)USENIX\(\}\) Security 18), pp 1299–1316

  112. Sun B, Takahashi T, Zhu L, Mori T (2020) Discovering malicious urls using machine learning techniques. Data science in cybersecurity and cyberthreat intelligence. Springer, New York, pp 33–60

    Chapter  Google Scholar 

  113. Talabis M, McPherson R, Miyamoto I, Martin J (2014) Information security analytics: finding security insights, patterns, and anomalies in big data. Syngress

  114. Tan Z, Nagar UT, He X, Nanda P, Liu RP, Wang S, Hu J (2014) Enhancing big data security with collaborative intrusion detection. IEEE Cloud Comput 1(3):27–33

    Article  Google Scholar 

  115. Tankard C (2012) Big data security. Netw Secur 2012(7):5–8

    Article  Google Scholar 

  116. Terzi DS, Terzi R, Sagiroglu S (2015) A survey on security and privacy issues in big data. In: 2015 10th international conference for internet technology and secured transactions (ICITST). IEEE, pp 202–207

  117. Thirumaran J et al (2018) Applications of big data analytics-network security. Int J Res Sci Eng Technol 5(1):55–59

    Google Scholar 

  118. Tipton H (2019) Information security management handbook, vol IV. CRC Press, Boca Raton

    Book  MATH  Google Scholar 

  119. Ugarte-Pedrero X, Graziano M, Balzarotti D (2019) A close look at a daily dataset of malware samples. ACM Trans Priv Secur 22(1):6

    Article  Google Scholar 

  120. Ullah F, Babar MA (2019) Architectural tactics for big data cybersecurity analytics systems: a review. J Syst Softw 151:81–118

    Article  Google Scholar 

  121. Von Solms R, Van Niekerk J (2013) From information security to cyber security. Comput Secur 38:97–102

    Article  Google Scholar 

  122. Wohlin C (2014) Guidelines for snowballing in systematic literature studies and a replication in software engineering. In: Proceedings of the 18th international conference on evaluation and assessment in software engineering, pp 1–10

  123. Xu Z, Wu Z, Li Z, Jee K, Rhee J, Xiao X, Xu F, Wang H, Jiang G (2016) High fidelity data reduction for big data security dependency analyses. In: Proceedings of the 2016 ACM SIGSAC conference on computer and communications security. ACM, New York, pp 504–516

  124. Yang X, Ma T, Shi Y (2007) Typical dos/ddos threats under ipv6. In: 2007 International multi-conference on computing in the global information technology (ICCGI’07). IEEE

  125. Yao Y, Viswanath B, Cryan J, Zheng H, Zhao BY (2017) Automated crowdturfing attacks and defenses in online review systems. In: Proceedings of the 2017 ACM SIGSAC conference on computer and communications security. ACM, New York, pp 1143–1158

  126. You I, Yim K (2010) Malware obfuscation techniques: a brief survey. In: 2010 International conference on broadband, wireless computing, communication and applications. IEEE, pp 297–300

  127. Yuan Y, Adhatarao SS, Lin M, Yuan Y, Liu Z, Fu X (2020) Ada: Adaptive deep log anomaly detector. In: IEEE INFOCOM 2020-IEEE conference on computer communications. IEEE, pp 2449–2458

  128. Zhang D (2018) Big data security and privacy protection. In: 8th International conference on management and computer science (ICMCS 2018). Atlantis Press

  129. Zhang J, Zhang R, Zhang Y, Yan G (2016) The rise of social botnets: Attacks and countermeasures. IEEE Trans Depend Secure Comput

  130. Zhao JY, Kessler EG, Yu J, Jalal K, Cooper CA, Brewer JJ, Schwaitzberg SD, Guo WA (2018) Impact of trauma hospital ransomware attack on surgical residency training. J Surg Res 232:389–397

    Article  Google Scholar 

  131. Zhu Z, Dumitraş T (2016) Featuresmith: automatically engineering features for malware detection by mining the security literature. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pp 767–778. ACM, New York

  132. Zoldi S, Athwal J, Li H, Kennel M, Xue X (2015) Cyber security adaptive analytics threat monitoring system and method. US Patent 9,191,403

  133. Zuech R, Khoshgoftaar TM, Wald R (2015) Intrusion detection and big heterogeneous data: a survey. J Big Data 2(1):3

    Article  Google Scholar 

  134. Zuo Y, Wu Y, Min G, Huang C, Pei K (2020) An intelligent anomaly detection scheme for micro-services architectures with temporal and spatial data analysis. IEEE Trans Cogn Commun Netw

Download references

Author information

Authors and Affiliations


Corresponding author

Correspondence to Mohammed M. Alani.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Literature summary tables

Appendix A: Literature summary tables

See Tables 1, 2, 3, 4, 5 and 6.

Table 1 Summary of papers in intrusion and anomaly detection section
Table 2 Summary of papers in spamming, spoofing, and phishing detection section
Table 3 Summary of papers in malware and ransomeware section
Table 4 Summary of papers in code security section
Table 5 Summary of papers in cloud security section
Table 6 Summary of papers in other applications section

Rights and permissions

Reprints and Permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Alani, M.M. Big data in cybersecurity: a survey of applications and future trends. J Reliable Intell Environ 7, 85–114 (2021).

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: