Regular Language Constrained Sequence Alignment Revisited

  • Conference paper
Combinatorial Algorithms (IWOCA 2010)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6460)

Abstract

Imposing constraints in the form of a finite automaton or a regular expression is an effective way to incorporate additional a priori knowledge into sequence alignment procedures. With this motivation, Arslan [1] introduced the Regular Language Constrained Sequence Alignment Problem and proposed an $O(n^2 t^4)$ time and $O(n^2 t^2)$ space algorithm for solving it, where $n$ is the length of the input strings and $t$ is the number of states in the non-deterministic automaton, which is given as input. Chung et al. [2] proposed a faster $O(n^2 t^3)$ time algorithm for the same problem. In this paper, we further speed up the algorithms for Regular Language Constrained Sequence Alignment by reducing their worst-case time complexity bound to $O(n^2 t^3 / \log t)$. This is done by establishing an optimal bound on the size of Straight-Line Programs solving the maxima computation subproblem of the basic dynamic programming algorithm. We also study another solution based on a Steiner Tree computation. While it does not improve the run time complexity in the worst case, our simulations show that both approaches are efficient in practice, especially when the input automata are dense.
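
The abstract does not spell out the maxima computation subproblem, so the following is only an illustrative sketch under a stated assumption: given a vector of t scores (one per automaton state) and, for each state, a set of predecessor states, compute the maximum score over each set. The sketch uses a generic block-lookup idea (blocks of about log2 t states with precomputed subset maxima), which is one classical way to save a logarithmic factor; it is not claimed to be the Straight-Line Program construction of the paper. The names `encode_pred_sets`, `block_maxima`, and `choose_block_size` are hypothetical.

```python
from math import log2

NEG_INF = float("-inf")


def choose_block_size(t):
    """Blocks of about log2(t) states (assumption: a standard choice, not from the paper)."""
    return max(1, int(log2(t))) if t > 0 else 1


def encode_pred_sets(pred_sets, t, k):
    """Encode each predecessor set as one bitmask per block of k states.

    This depends only on the automaton, so it can be done once and reused
    for every cell of the dynamic programming table.
    """
    n_blocks = (t + k - 1) // k
    encoded = []
    for preds in pred_sets:
        masks = [0] * n_blocks
        for p in preds:
            masks[p // k] |= 1 << (p % k)
        encoded.append(masks)
    return encoded


def block_maxima(values, encoded, k):
    """Return, for each encoded set, the maximum of the selected values.

    An empty set yields -inf.
    """
    # For each block, tabulate the maximum over every subset of the block.
    tables = []
    for start in range(0, len(values), k):
        block = values[start:start + k]
        table = [NEG_INF] * (1 << len(block))
        for mask in range(1, len(table)):
            low = mask & -mask  # lowest set bit of the subset mask
            table[mask] = max(table[mask ^ low], block[low.bit_length() - 1])
        tables.append(table)
    # Each set now costs one table lookup per block instead of one per state.
    return [
        max((tables[b][m] for b, m in enumerate(masks)), default=NEG_INF)
        for masks in encoded
    ]


if __name__ == "__main__":
    values = [3, -1, 7, 2, 5]
    pred_sets = [{0, 1}, {2, 4}, set(), {1, 3}, {0, 1, 2, 3, 4}]
    k = choose_block_size(len(values))
    encoded = encode_pred_sets(pred_sets, len(values), k)
    print(block_maxima(values, encoded, k))
```

With blocks of size about log2 t, each per-block table has at most t entries and each set needs roughly t / log2 t lookups, compared with up to t comparisons per set in the direct approach.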

References

  1. Arslan, A.: Regular expression constrained sequence alignment. Journal of Discrete Algorithms 5(4), 647–661 (2007)

  2. Chung, Y., Lu, C., Tang, C.: Efficient algorithms for regular expression constrained sequence alignment. Information Processing Letters 103(6), 240–246 (2007)

  3. Smith, T., Waterman, M.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981)

  4. Arslan, A., Egecioglu, O.: Algorithms for the constrained longest common subsequence problems. International Journal of Foundations of Computer Science 16(6), 1099–1110 (2005)

  5. Chen, Y., Chao, K.: On the generalized constrained longest common subsequence problems. Journal of Combinatorial Optimization, 1–10 (2009)

  6. Iliopoulos, C., Rahman, M.: New efficient algorithms for the LCS and constrained LCS problems. Information Processing Letters 106(1), 13–18 (2008)

  7. Peng, Z., Ting, H.: Time and space efficient algorithms for constrained sequence alignment. In: Domaratzki, M., Okhotin, A., Salomaa, K., Yu, S. (eds.) CIAA 2004. LNCS, vol. 3317, pp. 237–246. Springer, Heidelberg (2005)

  8. Tsai, Y.: The constrained longest common subsequence problem. Information Processing Letters 88(4), 173–176 (2003)

  9. Bairoch, A.: The PROSITE dictionary of sites and patterns in proteins, its current status. Nucleic Acids Research 21(13), 3097 (1993)

  10. Tang, C., Lu, C., Chang, M., Tsai, Y., Sun, Y., Chao, K., Chang, J., Chiou, Y., Wu, C., Chang, H., et al.: Constrained multiple sequence alignment tool development and its application to RNase family alignment. Journal of Bioinformatics and Computational Biology 1(2), 267–287 (2003)

  11. Bern, M., Plassmann, P.: The Steiner problem with edge lengths 1 and 2. Information Processing Letters 32(4), 171–176 (1989)

  12. Shi, W., Su, C.: The rectilinear Steiner arborescence problem is NP-complete. SIAM Journal on Computing 35(3), 729–740 (2006)

  13. Foulds, L., Graham, R.: The Steiner problem in phylogeny is NP-complete. Advances in Applied Mathematics 3(1), 43–49 (1982)

  14. Jia, W., Han, B., Au, P., He, Y., Zhou, W.: Optimal multicast tree routing for cluster computing in hypercube interconnection networks. IEICE Transactions on Information and Systems E87-D, 1625–1632 (2004)

  15. Lin, X., Ni, L.: Multicast communication in multicomputer networks. IEEE Transactions on Parallel and Distributed Systems 4(10), 1105–1117 (1993)

  16. Sheu, S., Yang, C.: Multicast algorithms for hypercube multiprocessors. Journal of Parallel and Distributed Computing 61(1), 137–149 (2001)

  17. Dinur, I., Safra, S.: On the hardness of approximating minimum vertex cover. Annals of Mathematics 162(1), 439–486 (2005)

  18. Sylvester, J.: Thoughts on inverse orthogonal matrices, simultaneous sign successions, and tessellated pavements in two or more colors, with applications to Newton’s rule, ornamental tile-work and the theory of numbers. Phil. Mag. 34(2), 461–475 (1867)

  19. Seberry, J., Yamada, M.: Hadamard matrices, sequences, and block designs. Contemporary Design Theory: A Collection of Surveys, 431–560 (1992)

  20. Savage, J.: An algorithm for the computation of linear forms. SIAM J. Comput. 3(2), 150–158 (1974)

  21. Hromkovič, J., Seibert, S., Wilke, T.: Translating regular expressions into small ε-free nondeterministic finite automata. Journal of Computer and System Sciences 62(4), 565–588 (2001)

  22. Schnitger, G.: Regular expressions and NFAs without epsilon-transitions. In: Durand, B., Thomas, W. (eds.) STACS 2006. LNCS, vol. 3884, p. 432. Springer, Heidelberg (2006)

  23. Geffert, V.: Translation of binary regular expressions into nondeterministic ε-free automata with O(n log n) transitions. Journal of Computer and System Sciences 66(3), 451–472 (2003)

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kucherov, G., Pinhas, T., Ziv-Ukelson, M. (2011). Regular Language Constrained Sequence Alignment Revisited. In: Iliopoulos, C.S., Smyth, W.F. (eds) Combinatorial Algorithms. IWOCA 2010. Lecture Notes in Computer Science, vol 6460. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19222-7_39

  • DOI: https://doi.org/10.1007/978-3-642-19222-7_39

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-19221-0

  • Online ISBN: 978-3-642-19222-7

  • eBook Packages: Computer Science, Computer Science (R0)
