
A Filtering Technique for All Pairs Approximate Parameterized String Matching

  • Shibsankar Das
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 834)

Abstract

This paper deals with the all-pairs approximate parameterized string matching problem with error threshold k, between two sets of equal-length strings. Let \(P=\{p_1, ~ p_2, \ldots , p_{n_P}\} \subseteq \varSigma _P^m\) and \(T=\{t_1, ~ t_2, \ldots , t_{n_T}\} \subseteq \varSigma _T^m\) be two sets of strings, where \(|\varSigma _P|=|\varSigma _T|\). For each \(p_i \in P\), the problem is to find the \(t_j \in T\) that is the closest approximate parameterized match to \(p_i\) within the threshold. The solution has time complexity \(O(n_P \, n_T \, m)\). We introduce a Parikh vector (PV) filtering technique that preprocesses the given strings and avoids unnecessary pairwise comparisons. PV-filtering does not change the asymptotic time complexity, but experiments show that it considerably reduces the running time for small error thresholds.
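The filtering idea can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it assumes the filter compares sorted Parikh vectors (symbol counts) and discards a pair when their L1 distance exceeds 2k, a necessary condition that follows because each mismatch changes at most two symbol counts by one. The exact condition used in the paper may differ. All function names are hypothetical, and the distance is computed by brute force over all bijections, which is only practical for very small alphabets.

```python
from collections import Counter
from itertools import permutations


def sorted_parikh(s, sigma_size):
    """Symbol counts of s in descending order, zero-padded to the alphabet size."""
    counts = sorted(Counter(s).values(), reverse=True)
    return counts + [0] * (sigma_size - len(counts))


def pv_filter_passes(p, t, sigma_size, k):
    """Assumed filter: keep the pair only if the L1 distance between the
    sorted Parikh vectors is at most 2k (necessary, not sufficient)."""
    a = sorted_parikh(p, sigma_size)
    b = sorted_parikh(t, sigma_size)
    return sum(abs(x - y) for x, y in zip(a, b)) <= 2 * k


def pm_distance_bruteforce(p, t, sigma_p, sigma_t):
    """Minimum Hamming distance between pi(p) and t over all bijections pi
    from sigma_p to sigma_t. Exponential in the alphabet size; illustration only."""
    best = len(p)
    for image in permutations(sigma_t):
        pi = dict(zip(sigma_p, image))
        d = sum(pi[a] != b for a, b in zip(p, t))
        best = min(best, d)
    return best


def closest_under_threshold(P, T, sigma_p, sigma_t, k):
    """For each p in P, return the closest t in T with distance <= k, or None."""
    result = {}
    for p in P:
        best = None
        for t in T:
            # Skip the expensive comparison when the filter rules the pair out.
            if not pv_filter_passes(p, t, len(sigma_p), k):
                continue
            d = pm_distance_bruteforce(p, t, sigma_p, sigma_t)
            if d <= k and (best is None or d < best[1]):
                best = (t, d)
        result[p] = best
    return result


if __name__ == "__main__":
    print(closest_under_threshold(["aabb"], ["xxxx", "xyxy", "xxyy"], "ab", "xy", 1))
```

With k = 1, the pair ("aabb", "xxxx") is rejected by the filter alone, since the sorted count vectors [2, 2] and [4, 0] differ by 4 > 2k. The pair ("aabb", "xyxy") passes the filter but fails the full comparison, which shows that the filter only prunes pairs and never replaces the distance computation.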

Keywords

Approximate parameterized string matching, Hamming distance, \(\gamma (k)\)-match of vectors, Parikh vector, PV-filtering technique

Notes

Acknowledgement

The author is grateful to Dr. Jan Holub for his helpful comments and suggestions.


Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  1. Department of Mathematics, Institute of Science, Banaras Hindu University, Varanasi, India
