Abstract
The internet revolution has made digital information easy to capture, process, store, distribute, and transmit. Computation and related technologies have developed significantly, and their use keeps expanding across many walks of life. As a result, huge amounts of data with diverse characteristics are continuously collected and stored in databases. Retrieving information from this enormous amount of data is a challenge. Information retrieval is an attempt to make sense of the information embedded in this huge volume of data. All these aspects suggest the need for intelligent data retrieval methodologies that retrieve useful information. In this paper, a concurrent information retrieval system (IRS) with multiple pattern multiple (\(2^\mathrm{N}\)) shaft sequential and parallel string matching algorithms is proposed. The proposed approach concurrently retrieves the sought information from a huge volume of data. Experimental results show that the proposed string matching algorithms reduce the search time in both the sequential and parallel environments.
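The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of searching several patterns concurrently over \(2^\mathrm{N}\) partitions of a text. It is not the authors' method: the function names `find_in_segment` and `concurrent_search`, the parameter `n`, and the reading of a "shaft" as one text partition are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def find_in_segment(text, patterns, start, end):
    """Return (position, pattern) pairs for matches that start in [start, end)."""
    hits = []
    for pattern in patterns:
        # Extend the window by len(pattern) - 1 so a match that straddles
        # the segment boundary is still found, but only by this segment.
        window_end = min(len(text), end + len(pattern) - 1)
        pos = text.find(pattern, start, window_end)
        while pos != -1:
            hits.append((pos, pattern))
            pos = text.find(pattern, pos + 1, window_end)
    return hits


def concurrent_search(text, patterns, n=2):
    """Search all patterns over 2**n segments of text, one task per segment.

    Hypothetical helper: the partitioning scheme is an assumption, not the
    scheme described in the paper.
    """
    patterns = [p for p in patterns if p]  # skip empty patterns
    parts = 2 ** n
    size = max(1, -(-len(text) // parts))  # ceiling division
    bounds = [(s, min(s + size, len(text))) for s in range(0, len(text), size)]
    with ThreadPoolExecutor(max_workers=parts) as pool:
        results = pool.map(lambda b: find_in_segment(text, patterns, *b), bounds)
        return sorted(hit for segment_hits in results for hit in segment_hits)


if __name__ == "__main__":
    print(concurrent_search("abracadabra", ["abra", "cad"], n=2))
```

Each segment reports only matches that begin inside its own range, so the overlap at the boundaries does not produce duplicates. In CPython, threads show the task structure but do not give CPU-bound speedup; a process pool would be the usual substitute for measuring parallel search time.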
Cite this article
Rao, C.S., Viswanadha Raju, S. Concurrent Information Retrieval System (IRS) for Large Volume of Data with Multiple Pattern Multiple (\(2^\mathrm{N}\)) Shaft Parallel String Matching. Ann. Data. Sci. 3, 175–203 (2016). https://doi.org/10.1007/s40745-016-0080-1
DOI: https://doi.org/10.1007/s40745-016-0080-1