Keywords and Synonyms
Approximate repetitions; Approximate periodicities
Problem Definition
Identification of periodic structures in words (variants of which are known as tandem repeats, repetitions, powers or runs) is a fundamental algorithmic task (see entry Squares and Repetitions). In many practical applications, such as DNA sequence analysis, the repetitions considered admit a certain variation between copies of the repeated pattern. In other words, the repetitions of interest are approximate tandem repeats, not only exact ones.
The simplest instance of an approximate tandem repeat is an approximate square. An approximate square in a word w is a subword uv, where u and v are within a given distance k according to some distance measure between words, such as Hamming distance or edit (also called Levenshtein) distance. There are several ways to define approximate tandem repeats as successions of approximate squares, i.e., to generalize to the approximate case the notion...
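The definition of an approximate square can be made concrete with a small sketch. The code below is a brute-force illustration only, not one of the algorithms from the literature listed under Recommended Reading; the function names (hamming_distance, edit_distance, is_approximate_square, hamming_squares) and the min_period parameter are hypothetical names chosen for this example.

```python
def hamming_distance(u, v):
    """Number of mismatching positions; defined only for equal-length words."""
    if len(u) != len(v):
        raise ValueError("Hamming distance requires words of equal length")
    return sum(a != b for a, b in zip(u, v))


def edit_distance(u, v):
    """Levenshtein distance via the standard dynamic program, O(|u|*|v|) time."""
    prev = list(range(len(v) + 1))          # distances from the empty prefix of u
    for i, a in enumerate(u, start=1):
        curr = [i]                          # distance from u[:i] to the empty word
        for j, b in enumerate(v, start=1):
            curr.append(min(
                prev[j] + 1,                # delete a
                curr[j - 1] + 1,            # insert b
                prev[j - 1] + (a != b),     # match or substitute
            ))
        prev = curr
    return prev[-1]


def is_approximate_square(u, v, k, distance=hamming_distance):
    """True if the word uv is an approximate square for threshold k."""
    return distance(u, v) <= k


def hamming_squares(w, k, min_period=1):
    """Yield (start, period) for every subword uv of w with |u| = |v| = period
    and Hamming distance at most k. Naive enumeration, cubic in len(w)."""
    n = len(w)
    for p in range(min_period, n // 2 + 1):
        for i in range(n - 2 * p + 1):
            if hamming_distance(w[i:i + p], w[i + p:i + 2 * p]) <= k:
                yield (i, p)
```

For instance, is_approximate_square("ACGT", "ACTT", 1) holds under Hamming distance, while the same pair is not a square for k = 0. Under edit distance, u and v may differ in length, so is_approximate_square("ACGT", "AGT", 1, distance=edit_distance) also holds.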
Recommended Reading
Benson, G.: Tandem Repeats Finder: a program to analyze DNA sequences. Nucl. Acids Res. 27, 573–580 (1999)
Boeva, V.A., Régnier, M., Makeev, V.J.: SWAN: searching for highly divergent tandem repeats in DNA sequences with the evaluation of their statistical significance. Proceedings of JOBIM 2004, Montreal, Canada, p. 40 (2004)
Butler, J.M.: Forensic DNA Typing: Biology and Technology Behind STR Markers. Academic Press (2001)
Crochemore, M.: Recherche linéaire d'un carré dans un mot. Comptes Rendus Acad. Sci. Paris Sér. I Math. 296, 781–784 (1983)
Delgrange, O., Rivals, E.: STAR – an algorithm to Search for Tandem Approximate Repeats. Bioinform. 20, 2812–2820 (2004)
Gelfand, Y., Rodriguez, A., Benson, G.: TRDB – The Tandem Repeats Database. Nucl. Acids Res. 35(suppl. 1), D80–D87 (2007)
Gusfield, D.: Algorithms on Strings, Trees, and Sequences. Cambridge University Press (1997)
Kolpakov, R., Kucherov, G.: Finding maximal repetitions in a word in linear time. In: 40th Symp. Foundations of Computer Science (FOCS), pp. 596–604. IEEE Computer Society Press (1999)
Kolpakov, R., Bana, G., Kucherov, G.: mreps: efficient and flexible detection of tandem repeats in DNA. Nucl. Acids Res. 31(13), 3672–3678 (2003)
Kolpakov, R., Kucherov, G.: Finding approximate repetitions under Hamming distance. Theoret. Comput. Sci. 303(1), 135–156 (2003)
Kolpakov, R., Kucherov, G.: Identification of periodic structures in words. In: Berstel, J., Perrin, D. (eds.) Applied combinatorics on words. Encyclopedia of Mathematics and its Applications. Lothaire books, vol. 104, pp. 430–477. Cambridge University Press (2005)
Landau, G.M., Vishkin, U.: Fast string matching with k differences. J. Comput. Syst. Sci. 37(1), 63–78 (1988)
Landau, G.M., Myers, E.W., Schmidt, J.P.: Incremental string comparison. SIAM J. Comput. 27(2), 557–582 (1998)
Landau, G.M., Schmidt, J.P., Sokol, D.: An algorithm for approximate tandem repeats. J. Comput. Biol. 8, 1–18 (2001)
Main, M.: Detecting leftmost maximal periodicities. Discret. Appl. Math. 25, 145–153 (1989)
Main, M., Lorentz, R.: An \( O(n \log n) \) algorithm for finding all repetitions in a string. J. Algorithms 5(3), 422–432 (1984)
Messer, P.W., Arndt, P.F.: The majority of recent short DNA insertions in the human genome are tandem duplications. Mol. Biol. Evol. 24(5), 1190–1197 (2007)
Rodeh, M., Pratt, V., Even, S.: Linear algorithm for data compression via string matching. J. Assoc. Comput. Mach. 28(1), 16–24 (1981)
Sokol, D., Benson, G., Tojeira, J.: Tandem repeats over the edit distance. Bioinform. 23(2), e30–e35 (2006)
Wexler, Y., Yakhini, Z., Kashi, Y., Geiger, D.: Finding approximate tandem repeats in genomic sequences. J. Comput. Biol. 12(7), 928–942 (2005)
Acknowledgments
This work was supported in part by the National Science Foundation Grant DB&I 0542751.
Copyright information
© 2008 Springer-Verlag
About this entry
Cite this entry
Kucherov, G., Sokol, D. (2008). Approximate Tandem Repeats. In: Kao, MY. (eds) Encyclopedia of Algorithms. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-30162-4_24
DOI: https://doi.org/10.1007/978-0-387-30162-4_24
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-30770-1
Online ISBN: 978-0-387-30162-4
eBook Packages: Computer Science, Reference Module Computer Science and Engineering