Using Finite Automata Approach for Searching Approximate Seeds of Strings
A seed is a type of regularity of strings. A restricted approximate seed w of a string T is a factor of T such that w covers a superstring of T under some distance rule. This paper studies the problem of searching for all restricted approximate seeds with the smallest Hamming distance and presents an algorithm that solves it in polynomial time and space. The algorithm finds all restricted approximate seeds of a string within a given maximum Hamming distance and computes the smallest distance for each seed found. The solution is based on a finite (suffix) automata approach, which provides a straightforward way to design algorithms for many problems in stringology. It thereby shows that the set of problems solvable using finite automata includes the one studied in this paper.
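To make the definition concrete, the following is a minimal brute-force sketch in Python. It is not the suffix-automaton algorithm of the paper. It checks every factor directly and rests on two assumptions that the abstract does not state: an occurrence overhanging either end of T is charged only for mismatches inside the overlap with T (the superstring is free to agree with w elsewhere), and the distance is kept below |w| to rule out trivial seeds. The function names are hypothetical.

```python
def hamming_overlap(w, t, p):
    """Count mismatches of w placed at offset p of t, only where the two overlap."""
    return sum(
        1
        for i, ch in enumerate(w)
        if 0 <= p + i < len(t) and t[p + i] != ch
    )


def is_approx_seed(w, t, k):
    """True if occurrences of w with at most k mismatches cover every position of t.

    Occurrences may overhang both ends of t (assumption: the overhanging
    part belongs to a freely chosen superstring and costs nothing).
    """
    n, m = len(t), len(w)
    if m == 0:
        return False
    covered_up_to = 0  # first position of t not yet covered
    for p in range(-(m - 1), n):  # offsets in increasing order
        if hamming_overlap(w, t, p) <= k:
            if p > covered_up_to:
                return False  # a gap that no later occurrence can fill
            covered_up_to = max(covered_up_to, min(n, p + m))
    return covered_up_to >= n


def smallest_seed_distance(w, t, k_max):
    """Smallest k <= k_max for which w is an approximate seed of t, else None."""
    for k in range(k_max + 1):
        if is_approx_seed(w, t, k):
            return k
    return None


def restricted_approx_seeds(t, k_max):
    """Map each factor of t that is a seed to its smallest Hamming distance.

    Assumption: the distance is limited to len(w) - 1 to exclude trivial seeds.
    """
    factors = {t[i:j] for i in range(len(t)) for j in range(i + 1, len(t) + 1)}
    seeds = {}
    for w in factors:
        d = smallest_seed_distance(w, t, min(k_max, len(w) - 1))
        if d is not None:
            seeds[w] = d
    return seeds
```

For T = "baabab", the factor "aba" is an exact seed (distance 0), whereas "ab" becomes a seed only when one mismatch is allowed. The sketch enumerates all factors and all alignments, so it serves only to illustrate the definition; the automata-based approach described in the paper is the intended way to solve the problem.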
Keywords: Hamming distance, Approximate seed, Suffix automaton, Stringology
This research was supported by the Czech Technical University in Prague as grant No. CTU0803113 and as grant No. CTU0915313, by the Ministry of Education, Youth and Sports of the Czech Republic under research program MSM 6840770014, and by the Czech Science Foundation as project No. 201/06/1039 and as project No. 201/09/0807.
- 3. Guth, O., Melichar, B., & Balík, M. (2008). Searching all approximate covers and their distance using finite automata. Information technologies – applications and theory (pp. 21–26). Košice: Univerzita P. J. Šafárika. http://ftp.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-414/paper4.pdf.
- 4. Iliopoulos, C.S., Moore, D., & Park, K.S. (1993). Covering a string. CPM ’93: Proceedings of the 4th annual symposium on combinatorial pattern matching (pp. 54–62). London, UK: Springer.
- 5. Voráček, M., & Melichar, B. (2006). Computing seeds in generalized strings. Proceedings of workshop 2006 (pp. 138–139). Prague: Czech Technical University.