Accurate and Efficient Methods to Improve Multiple Circular Sequence Alignment

  • Conference paper
Experimental Algorithms (SEA 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9125)

Abstract

Multiple sequence alignment is a core computational task in bioinformatics and has been extensively studied over the past decades. This computation relies on an implicit assumption about the input data: the left- and right-most positions of each sequence are meaningful. However, this is not the case for circular structures such as mitochondrial DNA (mtDNA). Efforts have been made to address this issue, but the problem is far from solved. We recently introduced a fast algorithm for approximate circular string matching (Barton et al., Algo Mol Biol, 2014). Here, we first show how to extend this algorithm to approximate circular dictionary matching, and then apply this solution together with agglomerative hierarchical clustering to find a sufficiently good rotation for each sequence. Furthermore, we propose an alternative method that is suitable for more divergent sequences. We implemented these methods in BEAR, a programme for improving multiple circular sequence alignment. Experimental results, using real and synthetic data, show the high accuracy and efficiency of these new methods in terms of the inferred likelihood-based phylogenies.
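
To make the rotation problem concrete, the following is a minimal brute-force sketch in Python. It is not the algorithm implemented in BEAR, and it uses neither approximate circular dictionary matching nor hierarchical clustering. It simply tries every rotation of one circular sequence against a linear reference and keeps the one with the smallest edit distance. The function names `edit_distance` and `best_rotation` are hypothetical and chosen for illustration only.

```python
from typing import Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard row-by-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def best_rotation(reference: str, seq: str) -> Tuple[int, int]:
    """Return (offset, distance) for the rotation of `seq` closest to `reference`.

    Assumption for illustration: `seq` is circular, so every rotation
    seq[r:] + seq[:r] represents the same molecule. Ties keep the smallest offset.
    """
    best = (0, edit_distance(reference, seq))
    for r in range(1, len(seq)):
        d = edit_distance(reference, seq[r:] + seq[:r])
        if d < best[1]:
            best = (r, d)
    return best


if __name__ == "__main__":
    # "TGCAACGT" is "ACGTTGCA" cut at a different position of the circle.
    offset, dist = best_rotation("ACGTTGCA", "TGCAACGT")
    print(offset, dist)
```

For two sequences of length n this exhaustive search runs in cubic time, which is exactly the kind of cost that dedicated approximate circular string matching algorithms are designed to avoid.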

References

  1. Baeza-Yates, R.A., Perleberg, C.H.: Fast and practical approximate string matching. Information Processing Letters 59(1), 21–27 (1996)

  2. Barton, C., Iliopoulos, C.S., Pissis, S.P.: Fast algorithms for approximate circular string matching. Algorithms for Molecular Biology 9(1), 9 (2014)

  3. Barton, C., Iliopoulos, C.S., Pissis, S.P.: Average-case optimal approximate circular string matching. In: Dediu, A.-H., Formenti, E., Martín-Vide, C., Truthe, B. (eds.) LATA 2015. LNCS, vol. 8977, pp. 85–96. Springer, Heidelberg (2015)

  4. Crochemore, M., Iliopoulos, C.S., Pissis, S.P.: A parallel algorithm for fixed-length approximate string-matching with k-mismatches. In: Elomaa, T., Mannila, H., Orponen, P. (eds.) Ukkonen Festschrift 2010. LNCS, vol. 6060, pp. 92–101. Springer, Heidelberg (2010)

  5. Dori, S., Landau, G.M.: Construction of Aho Corasick automaton in linear time for integer alphabets. Information Processing Letters 98(2), 66–72 (2006)

  6. Edgar, R.C.: MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5(1), 113 (2004)

  7. Fernandes, F., Pereira, L., Freitas, A.T.: CSA: An efficient algorithm to improve circular DNA multiple alignment. BMC Bioinformatics 10(1), 1–13 (2009)

  8. Fletcher, W., Yang, Z.: INDELible: A flexible simulator of biological sequence evolution. Molecular Biology and Evolution 26(8), 1879–1888 (2009)

  9. Fritzsch, G., Schlegel, M., Stadler, P.F.: Alignments of mitochondrial genome arrangements: Applications to metazoan phylogeny. Journal of Theoretical Biology 240(4), 511–520 (2006)

  10. Goios, A., Pereira, L., Bogue, M., Macaulay, V., Amorim, A.: mtDNA phylogeny and evolution of laboratory mouse strains. Genome Research 17(3), 293–298 (2007)

  11. Hirvola, T., Tarhio, J.: Approximate online matching of circular strings. In: Gudmundsson, J., Katajainen, J. (eds.) SEA 2014. LNCS, vol. 8504, pp. 315–325. Springer, Heidelberg (2014)

  12. Iliopoulos, C.S., Mouchard, L., Pinzon, Y.J.: The max-shift algorithm for approximate string matching. In: Brodal, G.S., Frigioni, D., Marchetti-Spaccamela, A. (eds.) WAE 2001. LNCS, vol. 2141, pp. 13–25. Springer, Heidelberg (2001)

  13. Katoh, K., Misawa, K., Kuma, K.I., Miyata, T.: MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Research 30(14), 3059–3066 (2002)

  14. Larkin, M., Blackshields, G., Brown, N., Chenna, R., McGettigan, P., McWilliam, H., Valentin, F., Wallace, I., Wilm, A., Lopez, R., Thompson, J., Gibson, T., Higgins, D.: Clustal W and Clustal X version 2.0. Bioinformatics 23(21), 2947–2948 (2007)

  15. Levenshtein, V.I.: Binary codes capable of correcting deletions, insertions, and reversals. Tech. Rep. 8 (1966)

  16. Maes, M.: On a cyclic string-to-string correction problem. Information Processing Letters 35(2), 73–78 (1990)

  17. Mosig, A., Hofacker, I.L., Stadler, P.F.: Comparative analysis of cyclic sequences: viroids and other small circular RNAs. In: Huson, D.H., Kohlbacher, O., Lupas, A.N., Nieselt, K., Zell, A. (eds.) German Conference on Bioinformatics. LNI, vol. 83, pp. 93–102. GI (2006)

  18. Myers, G.: A fast bit-vector algorithm for approximate string matching based on dynamic programming. Journal of the ACM 46(3), 395–415 (1999)

  19. Notredame, C., Higgins, D.G., Heringa, J.: T-Coffee: a novel method for fast and accurate multiple sequence alignment. Journal of Molecular Biology 302(1), 205–217 (2000)

  20. Rice, P., Longden, I., Bleasby, A.: EMBOSS: The European Molecular Biology Open Software Suite. Trends in Genetics 16(6), 276–277 (2000)

  21. Robinson, D., Foulds, L.: Comparison of phylogenetic trees. Mathematical Biosciences 53(1–2), 131–147 (1981)

  22. Sokal, R.R., Michener, C.D.: A statistical method for evaluating systematic relationships. University of Kansas Scientific Bulletin 28, 1409–1438 (1958)

  23. Stamatakis, A.: RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30(9), 1312–1313 (2014)

  24. Ukkonen, E.: On approximate string matching. In: Karpinski, M. (ed.) Foundations of Computation Theory. Lecture Notes in Computer Science, vol. 158, pp. 487–495. Springer, Berlin Heidelberg (1983)

  25. Ukkonen, E.: On-line construction of suffix trees. Algorithmica 14(3), 249–260 (1995)

  26. Wang, L., Jiang, T.: On the complexity of multiple sequence alignment. Journal of Computational Biology 1(4), 337–348 (1994)

  27. Wang, Z., Wu, M.: Phylogenomic reconstruction indicates mitochondrial ancestor was an energy parasite. PLoS ONE 9(10), e110685 (2014)

Author information

Corresponding author

Correspondence to Solon P. Pissis.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Barton, C., Iliopoulos, C.S., Kundu, R., Pissis, S.P., Retha, A., Vayani, F. (2015). Accurate and Efficient Methods to Improve Multiple Circular Sequence Alignment. In: Bampis, E. (ed.) Experimental Algorithms. SEA 2015. Lecture Notes in Computer Science, vol 9125. Springer, Cham. https://doi.org/10.1007/978-3-319-20086-6_19

  • DOI: https://doi.org/10.1007/978-3-319-20086-6_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-20085-9

  • Online ISBN: 978-3-319-20086-6

  • eBook Packages: Computer Science, Computer Science (R0)
