Fast Computation of a String Duplication History under No-Breakpoint-Reuse

(Extended Abstract)

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7024)

Abstract

In this paper, we provide an O(n log² n log log n log* n) algorithm to compute a duplication history of a string under the no-breakpoint-reuse condition. Our algorithm is an efficient implementation of earlier work by Zhang et al. (2009). The motivation for this problem stems from computational biology, in particular from the analysis of complex gene clusters. The problem is also related to computing edit distance with block operations, but in our scenario the start of the history is not fixed; it is chosen to minimize the distance measure.

References

  • Alstrup, S., Brodal, G.S., Rauhe, T.: Pattern matching in dynamic texts. In: Symposium on Discrete Algorithms (SODA), pp. 819–828 (2000)

  • Ann, H.-Y., Yang, C.-B., Peng, Y.-H., Liaw, B.-C.: Efficient algorithms for the block edit problems. Information and Computation 208(3), 221–229 (2010)

  • Cormode, G., Muthukrishnan, S.: The string edit distance matching problem with moves. ACM Transactions on Algorithms 3(1), 1–19 (2007)

  • Elemento, O., Gascuel, O., Lefranc, M.-P.: Reconstructing the duplication history of tandemly repeated genes. Molecular Biology and Evolution 19(3), 278–288 (2002)

  • Ergün, F., Muthukrishnan, S.M., Şahinalp, S.C.: Comparing sequences with segment rearrangements. In: Pandya, P.K., Radhakrishnan, J. (eds.) FSTTCS 2003. LNCS, vol. 2914, pp. 183–194. Springer, Heidelberg (2003)

  • Kahn, C.L., Mozes, S., Raphael, B.J.: Efficient algorithms for analyzing segmental duplications with deletions and inversions in genomes. Algorithms for Molecular Biology 5(1), 11 (2010)

  • Kärkkäinen, J., Sanders, P., Burkhardt, S.: Linear work suffix array construction. Journal of the ACM 53(6), 918–936 (2006)

  • Kasai, T., Lee, G.H., Arimura, H., Arikawa, S., Park, K.: Linear-time longest-common-prefix computation in suffix arrays and its applications. In: Amir, A., Landau, G.M. (eds.) CPM 2001. LNCS, vol. 2089, pp. 181–192. Springer, Heidelberg (2001)

  • Lajoie, M., Bertrand, D., El-Mabrouk, N., Gascuel, O.: Duplication and inversion history of a tandemly repeated genes family. Journal of Computational Biology 14(4), 462–468 (2007)

  • Lopresti, D.P., Tomkins, A.: Block edit models for approximate string matching. Theoretical Computer Science 181(1), 159–179 (1997)

  • Ma, J., Ratan, A., Raney, B.J., Suh, B.B., Miller, W., Haussler, D.: The infinite sites model of genome evolution. Proceedings of the National Academy of Sciences USA 105(38), 14254–14261 (2008)

  • Mehlhorn, K., Näher, S.: Dynamic fractional cascading. Algorithmica 5(2), 215–241 (1990)

  • Nadeau, J.H., Taylor, B.A.: Lengths of chromosomal segments conserved since divergence of man and mouse. Proceedings of the National Academy of Sciences USA 81(3), 814–818 (1984)

  • Shapira, D., Storer, J.A.: Large edit distance with multiple block operations. In: Nascimento, M.A., de Moura, E.S., Oliveira, A.L. (eds.) SPIRE 2003. LNCS, vol. 2857, pp. 369–377. Springer, Heidelberg (2003)

  • Shapira, D., Storer, J.A.: Edit distance with move operations. Journal of Discrete Algorithms 5(2), 380–392 (2007)

  • Song, G., Zhang, L., Vinar, T., Miller, W.: CAGE: Combinatorial Analysis of Gene-cluster Evolution. Journal of Computational Biology 17(9), 1227–1232 (2010)

  • Vinar, T., Brejova, B., Song, G., Siepel, A.: Reconstructing histories of complex gene clusters on a phylogeny. Journal of Computational Biology 17(9), 1267–1269 (2010)

  • Zhang, Y., Song, G., Vinar, T., Green, E.D., Siepel, A., Miller, W.: Evolutionary history reconstruction for mammalian complex gene clusters. Journal of Computational Biology 16(8), 1051–1060 (2009)

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Brejová, B., Landau, G.M., Vinař, T. (2011). Fast Computation of a String Duplication History under No-Breakpoint-Reuse. In: Grossi, R., Sebastiani, F., Silvestri, F. (eds) String Processing and Information Retrieval. SPIRE 2011. Lecture Notes in Computer Science, vol 7024. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24583-1_15

  • DOI: https://doi.org/10.1007/978-3-642-24583-1_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24582-4

  • Online ISBN: 978-3-642-24583-1

  • eBook Packages: Computer Science, Computer Science (R0)
