Borders and Finite Automata
A border of a string is a prefix of the string that is also a suffix of it. Borders are a basic building block of stringology and are used in many algorithms in pattern matching, molecular biology, computer-assisted music analysis, and other areas. This paper discusses the automata-theoretic background of Iliopoulos’s ALL_BORDERS algorithm, which finds all borders of a string with don’t care symbols. We show that ALL_BORDERS simulates a finite automaton, and we explain how that automaton works. The simulated automaton accepts the intersection of the set of prefixes and the set of suffixes of the input string, and thus the set of its borders. We also define approximate borders. Building on the automata background of ALL_BORDERS, we present an automata-based algorithm that finds approximate borders under the Hamming distance. Finally, we discuss the conditions under which the same principle can be used for other distance measures for which an approximate searching automaton can be constructed.
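To make the notions of border and approximate border concrete, here is a minimal Python sketch. It is a direct quadratic comparison of each prefix with the equally long suffix, not the automaton simulation the paper describes and not Iliopoulos's ALL_BORDERS algorithm. The function name `hamming_border_lengths`, its parameters, and the `*` don't care symbol are hypothetical. The sketch also rests on two assumptions: that an approximate border is a prefix whose Hamming distance to the suffix of the same length is at most `k`, and that only proper, non-empty borders are reported.

```python
def hamming_border_lengths(text, k=0, dont_care=None):
    """Return lengths l (0 < l < len(text)) such that the prefix of length l
    differs from the suffix of length l in at most k positions.

    Assumption: a position holding the dont_care symbol matches any symbol.
    With k=0 this yields the exact borders of the string.
    """
    n = len(text)
    lengths = []
    for length in range(1, n):
        prefix = text[:length]
        suffix = text[n - length:]
        mismatches = 0
        for a, b in zip(prefix, suffix):
            # A mismatch counts only if neither symbol is a don't care.
            if a != b and dont_care not in (a, b):
                mismatches += 1
        if mismatches <= k:
            lengths.append(length)
    return lengths


if __name__ == "__main__":
    print(hamming_border_lengths("abacaba"))                 # exact borders
    print(hamming_border_lengths("abacaba", k=2))            # up to 2 mismatches
    print(hamming_border_lengths("a*cab", dont_care="*"))    # with don't cares
```

For `"abacaba"` the exact borders are `a` and `aba`, so the first call returns the lengths 1 and 3. Allowing two mismatches additionally admits the prefixes of lengths 2 and 5.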