BPBM: An Algorithm for String Matching with Wildcards and Length Constraints

  • Xiao-Li Hong
  • Xindong Wu
  • Xue-Gang Hu
  • Ying-Ling Liu
  • Jun Gao
  • Gong-Qing Wu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5908)

Abstract

Pattern matching with wildcards and length constraints under the one-off condition is a challenging problem. We propose BPBM, an algorithm based on bit parallelism and the Boyer-Moore algorithm, that outputs an occurrence of a given pattern P as soon as the pattern appears in the given sequence. Experimental results show that BPBM improves time performance by more than 50% compared with SAIL, a state-of-the-art algorithm for this matching problem, while producing the same matching results. The advantage becomes even more pronounced as the scale of the pattern increases.
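To make the problem setting concrete, the following is a minimal sketch of bit-parallel matching for a pattern whose consecutive characters are separated by a bounded number of wildcards. It is not the BPBM algorithm: it ignores the one-off condition and uses no Boyer-Moore shifting. The function name `gapped_end_positions` and the assumption of a single gap range `[min_gap, max_gap]` shared by all adjacent character pairs are illustrative choices, not taken from the paper.

```python
def gapped_end_positions(text, chars, min_gap, max_gap):
    """Return the text indices at which the gapped pattern can end.

    `chars` holds the pattern's fixed characters; between each adjacent
    pair, min_gap..max_gap arbitrary text characters (wildcards) may occur.
    Python integers serve as bitsets: bit i stands for text position i.
    """
    # One bitmask per distinct symbol: bit i is set when text[i] == symbol.
    masks = {}
    for i, c in enumerate(text):
        masks[c] = masks.get(c, 0) | (1 << i)

    full = (1 << len(text)) - 1          # keeps bits inside the text
    ends = masks.get(chars[0], 0)        # where the first character matches

    for c in chars[1:]:
        reach = 0
        # Shifting by gap + 1 skips `gap` wildcards and lands on the
        # position where the next pattern character must appear.
        for gap in range(min_gap, max_gap + 1):
            reach |= ends << (gap + 1)
        ends = reach & masks.get(c, 0) & full

    return [i for i in range(len(text)) if (ends >> i) & 1]


print(gapped_end_positions("AAGGTT", "AGT", 0, 1))  # [4, 5]
```

Each shift-and-mask step advances every partial match in the text at once, which is the general idea behind bit parallelism. An algorithm for the one-off setting would additionally have to track which text positions have already been consumed by reported occurrences.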

Keywords

Pattern matching, wildcards, length constraints, bit-parallelism

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Xiao-Li Hong (1)
  • Xindong Wu (1, 2)
  • Xue-Gang Hu (1)
  • Ying-Ling Liu (1)
  • Jun Gao (1)
  • Gong-Qing Wu (1)

  1. School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, China
  2. Department of Computer Science, University of Vermont, Burlington, USA
