Concurrent Information Retrieval System (IRS) for Large Volume of Data with Multiple Pattern Multiple (\(2^\mathrm{N}\)) Shaft Parallel String Matching

Annals of Data Science

Abstract

The internet revolution has made digital information easy to capture, process, store, distribute, and transmit. Computation and related technologies have developed significantly, and their use keeps expanding across many walks of life. As a result, huge amounts of data with diverse characteristics are continuously collected and stored in databases. Retrieving information from this enormous volume of data is a real challenge. Information retrieval is an attempt to make sense of the information embedded in this data. All of these aspects point to the need for intelligent methodologies for retrieving useful information. In this paper, a concurrent information retrieval system (IRS) with multiple pattern multiple (\(2^\mathrm{N}\)) shaft sequential and parallel string matching algorithms is proposed. The proposed approach concurrently retrieves the searched-for information from a huge volume of data. Experimental results show that the proposed string matching algorithms reduce search time in both the sequential and parallel environments.
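The general idea of searching a large text for several patterns at once over \(2^\mathrm{N}\) partitions can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not define a "shaft", so the sketch assumes it means a contiguous slice of the text, and the names `search_chunk` and `parallel_multi_search`, the parameter `n`, and the use of Python's built-in `str.find` as the matcher are all illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def search_chunk(text, patterns, start, end):
    """Return (position, pattern) for every match that starts in [start, end)."""
    hits = []
    for pat in patterns:
        if not pat:
            continue  # ignore empty patterns
        # Let the window run len(pat) - 1 characters past the chunk so that
        # matches spanning a chunk boundary are still found, exactly once.
        window_end = min(len(text), end + len(pat) - 1)
        pos = text.find(pat, start, window_end)
        while pos != -1:
            hits.append((pos, pat))
            pos = text.find(pat, pos + 1, window_end)
    return hits


def parallel_multi_search(text, patterns, n=2):
    """Search `text` for all `patterns` using 2**n chunks (assumed 'shafts')."""
    parts = 2 ** n
    size = -(-len(text) // parts)  # ceiling division
    bounds = [(i * size, min(len(text), (i + 1) * size)) for i in range(parts)]
    bounds = [(s, e) for s, e in bounds if s < e]  # drop empty chunks
    with ThreadPoolExecutor(max_workers=parts) as pool:
        futures = [pool.submit(search_chunk, text, patterns, s, e)
                   for s, e in bounds]
        hits = [h for f in futures for h in f.result()]
    return sorted(hits)


if __name__ == "__main__":
    for pos, pat in parallel_multi_search("GATTACAGATTACA", ["GATT", "ACA", "TAC"]):
        print(pos, pat)
```

Each worker reports only matches that start inside its own chunk, so overlapping windows never produce duplicates. In CPython, threads mostly illustrate the structure rather than deliver CPU parallelism; swapping in `ProcessPoolExecutor` is one way to use multiple cores, at the cost of copying the text to each worker.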

Author information

Corresponding author

Correspondence to Chinta Someswara Rao.

Cite this article

Rao, C.S., Viswanadha Raju, S. Concurrent Information Retrieval System (IRS) for Large Volume of Data with Multiple Pattern Multiple (\(2^\mathrm{N}\)) Shaft Parallel String Matching. Ann. Data. Sci. 3, 175–203 (2016). https://doi.org/10.1007/s40745-016-0080-1

  • DOI: https://doi.org/10.1007/s40745-016-0080-1
