Malware Detection in Big Data Using Fast Pattern Matching: A Hadoop Based Comparison on GPU

  • Chhabi Rani Panigrahi
  • Mayank Tiwari
  • Bibudhendu Pati
  • Rajendra Prasath
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8891)


In a big data environment, Hadoop stores data in a distributed file system, the Hadoop Distributed File System (HDFS), and processes it in parallel. When cloud users store unstructured data in cloud storage, securing that data becomes an important responsibility for cloud providers. To provide malware protection, a cloud service provider must scan the entire contents of the database, which is very time-intensive and may take days to complete. The main aim of the proposed work is to reduce this processing time by introducing a Graphics Processing Unit (GPU) into the Hadoop cluster. The proposed work integrates two text pattern matching algorithms with the MapReduce programming model for faster detection of malware in big data. The results of our study indicate that using a GPU decreases the processing time of text pattern matching algorithms on big data in Hadoop.
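To make the combination of pattern matching and map-reduce concrete, here is a minimal single-machine sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not name the two algorithms, so Boyer-Moore-Horspool is used as a stand-in; the function names, file names, and signature names are hypothetical; and the code runs on the CPU without Hadoop or a GPU.

```python
from collections import defaultdict


def horspool_find_all(text: bytes, pattern: bytes) -> list:
    """Return every offset where pattern occurs in text (Boyer-Moore-Horspool).

    Stand-in matcher: the abstract does not say which algorithms were used.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Bad-character table: shift distance keyed by the last byte of the window.
    shift = {b: m - 1 - i for i, b in enumerate(pattern[:-1])}
    hits, pos = [], 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        # Bytes absent from the table allow a full-length shift.
        pos += shift.get(text[pos + m - 1], m)
    return hits


def map_scan(file_id, data, signatures):
    """Map step: emit (signature_name, (file_id, offset)) for each match."""
    for name, pattern in signatures.items():
        for offset in horspool_find_all(data, pattern):
            yield name, (file_id, offset)


def reduce_hits(pairs):
    """Reduce step: group the emitted hits by signature name."""
    grouped = defaultdict(list)
    for name, hit in pairs:
        grouped[name].append(hit)
    return dict(grouped)


def scan(files, signatures):
    """Run the map step over every file, then reduce (sequentially here)."""
    pairs = [kv for fid, data in files.items()
             for kv in map_scan(fid, data, signatures)]
    return reduce_hits(pairs)


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    files = {"a.bin": b"xxEVILxxEVIL", "b.bin": b"clean data"}
    signatures = {"Test.Evil": b"EVIL", "Test.None": b"ZZZ"}
    print(scan(files, signatures))
```

In this sketch each file is one map record. On a real cluster a file may be split across blocks, so a signature that straddles a split boundary would be missed unless adjacent splits overlap by at least the longest pattern length minus one byte.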





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Chhabi Rani Panigrahi (1)
  • Mayank Tiwari (1)
  • Bibudhendu Pati (2)
  • Rajendra Prasath (3)
  1. Dept. of Information Technology, C.V. Raman College of Engineering, Bhubaneswar, India
  2. Dept. of Computer Science and Engineering, C.V. Raman College of Engineering, Bhubaneswar, India
  3. Business Information Systems, University College Cork, Cork, Ireland
