Alignments with non-overlapping moves, inversions and tandem duplications in O(n^4) time

Journal of Combinatorial Optimization

Abstract

Sequence alignment is a central problem in bioinformatics. The classical dynamic programming algorithm aligns two sequences by optimizing over possible insertions, deletions and substitutions. However, other evolutionary events can be observed, such as inversions, tandem duplications or moves (transpositions). It has been established that the extension of the problem to move operations is NP-complete. Previous work has shown that an extension restricted to non-overlapping inversions can be solved in O(n^3) time with a restricted scoring scheme. In this paper, we show that the alignment problem extended to non-overlapping moves can be solved in O(n^5) time for general scoring schemes, O(n^4 log n) time for concave scoring schemes and O(n^4) time for restricted scoring schemes. Furthermore, we show that the alignment problem extended to non-overlapping moves, inversions and tandem duplications can be solved with the same time complexities. Finally, an example of an alignment with non-overlapping moves is provided.
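
For orientation, the classical dynamic programming algorithm mentioned in the abstract can be sketched as follows. This is a minimal illustration of the baseline only (insertions, deletions and substitutions), not the paper's algorithm for moves, inversions or tandem duplications. The function name `global_alignment_score` and the scoring values (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen for the example and do not come from the article.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Score of an optimal global alignment of a and b (Needleman-Wunsch style).

    Runs in O(len(a) * len(b)) time. The scoring values are illustrative
    assumptions, not the scoring schemes discussed in the paper.
    """
    n, m = len(a), len(b)
    # Row 0: aligning the empty prefix of a with b[:j] costs j gaps.
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, n + 1):
        # Column 0: aligning a[:i] with the empty prefix of b costs i gaps.
        curr = [i * gap] + [0] * m
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                prev[j - 1] + sub,  # substitution or match
                prev[j] + gap,      # a[i-1] aligned to a gap
                curr[j - 1] + gap,  # b[j-1] aligned to a gap
            )
        prev = curr
    return prev[m]
```

Only two rows of the table are kept because the sketch returns the score alone; recovering the alignment itself would require the full table or a traceback structure.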

Author information

Corresponding author

Correspondence to Christian Ledergerber.

Additional information

A preliminary version of this paper appeared in the Proceedings of COCOON 2007, LNCS, vol. 4598, pp. 151–164.

About this article

Cite this article

Ledergerber, C., Dessimoz, C. Alignments with non-overlapping moves, inversions and tandem duplications in O(n^4) time. J Comb Optim 16, 263–278 (2008). https://doi.org/10.1007/s10878-007-9132-y

  • DOI: https://doi.org/10.1007/s10878-007-9132-y
