Approximate Tandem Repeats

2001; Landau, Schmidt, Sokol
2003; Kolpakov, Kucherov

Keywords and Synonyms

Approximate repetitions; Approximate periodicities

Problem Definition

Identification of periodic structures in words (variants of which are known as tandem repeats, repetitions, powers or runs) is a fundamental algorithmic task (see entry Squares and Repetitions). In many practical applications, such as DNA sequence analysis, the repetitions considered admit some variation between copies of the repeated pattern. In other words, the repetitions of interest are approximate tandem repeats rather than exact repeats only.

The simplest instance of an approximate tandem repeat is an approximate square. An approximate square in a word w is a subword uv, where u and v are within a given distance k according to some distance measure between words, such as Hamming distance or edit (also called Levenshtein) distance. There are several ways to define approximate tandem repeats as successions of approximate squares, i.e. to generalize to the approximate case the notion...
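To make the definition of an approximate square concrete, the following is a minimal brute-force sketch for the Hamming-distance case only, where u and v necessarily have the same length. It is not the algorithm of Landau, Schmidt and Sokol or of Kolpakov and Kucherov; the function name `hamming_squares` and the (start, period) output format are assumptions chosen for illustration. For each period p it slides a window over w and maintains the number of mismatches between w[i..i+p) and w[i+p..i+2p), which takes quadratic time overall.

```python
def hamming_squares(w, k):
    """Naive enumeration of approximate squares under Hamming distance.

    Returns a list of (start, period) pairs such that the subword
    w[start : start + 2 * period] splits into halves u and v of equal
    length `period` with Hamming distance at most k.
    Illustrative sketch only; runs in O(n^2) time for a word of length n.
    """
    n = len(w)
    found = []
    for p in range(1, n // 2 + 1):
        # Mismatch count between u = w[0:p] and v = w[p:2p].
        mism = sum(w[j] != w[j + p] for j in range(p))
        if mism <= k:
            found.append((0, p))
        # Slide the window one position to the right at a time.
        for i in range(1, n - 2 * p + 1):
            # Position i-1 leaves the window, position i+p-1 enters it.
            mism -= w[i - 1] != w[i - 1 + p]
            mism += w[i + p - 1] != w[i + 2 * p - 1]
            if mism <= k:
                found.append((i, p))
    return found


if __name__ == "__main__":
    # "ACGT" and "ACGA" differ in one position, so with k = 1
    # the whole word is reported as an approximate square of period 4.
    print(hamming_squares("ACGTACGA", 1))
```

With k = 0 the function reports exact squares only; for example, `hamming_squares("abab", 0)` returns `[(0, 2)]`. Handling edit distance would require comparing halves of different lengths and is not covered by this sketch.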

Recommended Reading

  1. Benson, G.: Tandem Repeats Finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573–580 (1999)
  2. Boeva, V.A., Régnier, M., Makeev, V.J.: SWAN: searching for highly divergent tandem repeats in DNA sequences with the evaluation of their statistical significance. Proceedings of JOBIM 2004, Montreal, Canada, p. 40 (2004)
  3. Butler, J.M.: Forensic DNA Typing: Biology and Technology Behind STR Markers. Academic Press (2001)
  4. Crochemore, M.: Recherche linéaire d'un carré dans un mot. Comptes Rendus Acad. Sci. Paris Sér. I Math. 296, 781–784 (1983)
  5. Delgrange, O., Rivals, E.: STAR – an algorithm to Search for Tandem Approximate Repeats. Bioinform. 20, 2812–2820 (2004)
  6. Gelfand, Y., Rodriguez, A., Benson, G.: TRDB – The Tandem Repeats Database. Nucl. Acids Res. 35(suppl. 1), D80–D87 (2007)
  7. Gusfield, D.: Algorithms on Strings, Trees, and Sequences. Cambridge University Press (1997)
  8. Kolpakov, R., Kucherov, G.: Finding maximal repetitions in a word in linear time. In: 40th Symp. Foundations of Computer Science (FOCS), pp. 596–604. IEEE Computer Society Press (1999)
  9. Kolpakov, R., Bana, G., Kucherov, G.: mreps: efficient and flexible detection of tandem repeats in DNA. Nucl. Acids Res. 31(13), 3672–3678 (2003)
  10. Kolpakov, R., Kucherov, G.: Finding approximate repetitions under Hamming distance. Theoret. Comput. Sci. 303(1), 135–156 (2003)
  11. Kolpakov, R., Kucherov, G.: Identification of periodic structures in words. In: Berstel, J., Perrin, D. (eds.) Applied combinatorics on words. Encyclopedia of Mathematics and its Applications. Lothaire books, vol. 104, pp. 430–477. Cambridge University Press (2005)
  12. Landau, G.M., Vishkin, U.: Fast string matching with k differences. J. Comput. Syst. Sci. 37(1), 63–78 (1988)
  13. Landau, G.M., Myers, E.W., Schmidt, J.P.: Incremental string comparison. SIAM J. Comput. 27(2), 557–582 (1998)
  14. Landau, G.M., Schmidt, J.P., Sokol, D.: An algorithm for approximate tandem repeats. J. Comput. Biol. 8, 1–18 (2001)
  15. Main, M.: Detecting leftmost maximal periodicities. Discret. Appl. Math. 25, 145–153 (1989)
  16. Main, M., Lorentz, R.: An \( O(n \log n) \) algorithm for finding all repetitions in a string. J. Algorithms 5(3), 422–432 (1984)
  17. Messer, P.W., Arndt, P.F.: The majority of recent short DNA insertions in the human genome are tandem duplications. Mol. Biol. Evol. 24(5), 1190–1197 (2007)
  18. Rodeh, M., Pratt, V., Even, S.: Linear algorithm for data compression via string matching. J. Assoc. Comput. Mach. 28(1), 16–24 (1981)
  19. Sokol, D., Benson, G., Tojeira, J.: Tandem repeats over the edit distance. Bioinform. 23(2), e30–e35 (2006)
  20. Wexler, Y., Yakhini, Z., Kashi, Y., Geiger, D.: Finding approximate tandem repeats in genomic sequences. J. Comput. Biol. 12(7), 928–942 (2005)

Acknowledgments

This work was supported in part by the National Science Foundation Grant DB&I 0542751.

Copyright information

© 2008 Springer-Verlag

About this entry

Cite this entry

Kucherov, G., Sokol, D. (2008). Approximate Tandem Repeats. In: Kao, MY. (eds) Encyclopedia of Algorithms. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-30162-4_24
