Probabilistic Approaches to Alignment with Tandem Repeats

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 8126)

Abstract

We propose a simple tractable pair hidden Markov model for pairwise sequence alignment that accounts for the presence of short tandem repeats. Using the framework of gain functions, we design several optimization criteria for decoding this model and describe the resulting decoding algorithms, ranging from the traditional Viterbi and posterior decoding to block-based decoding algorithms specialized for our model. We compare the accuracy of individual decoding algorithms on simulated data and find our approach superior to the classical three-state pair HMM in simulations.
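For orientation, the classical three-state pair HMM used as the baseline above can be decoded with the Viterbi algorithm roughly as follows. This is a minimal sketch and not the authors' tandem-repeat model or code: the function name `viterbi_pair_hmm`, the parameters `delta`, `epsilon`, and `match_p`, their default values, the convention of starting in the match state, and the absence of an explicit end state are all illustrative assumptions.

```python
import math

NEG_INF = float("-inf")


def viterbi_pair_hmm(x, y, delta=0.1, epsilon=0.3, match_p=0.9):
    """Viterbi decoding for a three-state pair HMM over DNA.

    States: M emits an aligned pair, X emits a symbol of x against a gap,
    Y emits a symbol of y against a gap. All parameters are hypothetical.
    Returns (log probability, gapped x, gapped y).
    """
    log = math.log

    def emit_m(a, b):
        # Joint emission over the 16 nucleotide pairs: 4 matches, 12 mismatches.
        return log(match_p / 4) if a == b else log((1 - match_p) / 12)

    emit_gap = log(0.25)  # uniform single-symbol emission in X and Y
    trans = {
        ("M", "M"): log(1 - 2 * delta),
        ("M", "X"): log(delta),
        ("M", "Y"): log(delta),
        ("X", "X"): log(epsilon),
        ("X", "M"): log(1 - epsilon),
        ("Y", "Y"): log(epsilon),
        ("Y", "M"): log(1 - epsilon),
    }
    # How many symbols of x and y each state consumes.
    step = {"M": (1, 1), "X": (1, 0), "Y": (0, 1)}
    states = "MXY"
    n, m = len(x), len(y)
    V = {s: [[NEG_INF] * (m + 1) for _ in range(n + 1)] for s in states}
    back = {s: [[None] * (m + 1) for _ in range(n + 1)] for s in states}
    V["M"][0][0] = 0.0  # assumption: the chain starts in M before any emission

    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            for s in states:
                di, dj = step[s]
                pi, pj = i - di, j - dj
                if pi < 0 or pj < 0:
                    continue
                e = emit_m(x[i - 1], y[j - 1]) if s == "M" else emit_gap
                best, arg = NEG_INF, None
                for p in states:
                    if (p, s) not in trans:
                        continue
                    cand = V[p][pi][pj] + trans[(p, s)]
                    if cand > best:
                        best, arg = cand, p
                if arg is not None:
                    V[s][i][j] = best + e
                    back[s][i][j] = arg

    # Trace back from the best state at the end of both sequences.
    state = max(states, key=lambda s: V[s][n][m])
    score = V[state][n][m]
    i, j = n, m
    top, bottom = [], []
    while i > 0 or j > 0:
        prev = back[state][i][j]
        di, dj = step[state]
        top.append(x[i - 1] if di else "-")
        bottom.append(y[j - 1] if dj else "-")
        i, j = i - di, j - dj
        state = prev
    return score, "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    # One sequence carries extra copies of a short "CA" repeat.
    score, a, b = viterbi_pair_hmm("GTCACACACAGT", "GTCACAGT")
    print(a)
    print(b)
    print(round(score, 3))
```

A model of this kind has no notion of repeats, so it must place the extra repeat copies as ordinary gaps at an essentially arbitrary position within the repeat. That is the kind of situation the tandem-repeat-aware model in the abstract is meant to address.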

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nánási, M., Vinař, T., Brejová, B. (2013). Probabilistic Approaches to Alignment with Tandem Repeats. In: Darling, A., Stoye, J. (eds) Algorithms in Bioinformatics. WABI 2013. Lecture Notes in Computer Science, vol 8126. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40453-5_22

  • DOI: https://doi.org/10.1007/978-3-642-40453-5_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40452-8

  • Online ISBN: 978-3-642-40453-5

  • eBook Packages: Computer Science (R0)
