Abstract
In this paper, we provide an O(n log² n log log n log* n) algorithm to compute a duplication history of a string under the no-breakpoint-reuse condition. Our algorithm is an efficient implementation of earlier work by Zhang et al. (2009). The motivation for this problem stems from computational biology, in particular from the analysis of complex gene clusters. The problem is also related to computing edit distance with block operations, but in our scenario the start of the history is not fixed; instead, it is chosen to minimize the distance measure.
References
Alstrup, S., Brodal, G.S., Rauhe, T.: Pattern matching in dynamic texts. In: Symposium on Discrete Algorithms (SODA), pp. 819–828 (2000)
Ann, H.-Y., Yang, C.-B., Peng, Y.-H., Liaw, B.-C.: Efficient algorithms for the block edit problems. Information and Computation 208(3), 221–229 (2010)
Cormode, G., Muthukrishnan, S.: The string edit distance matching problem with moves. ACM Transactions on Algorithms 3(1), 1–19 (2007)
Elemento, O., Gascuel, O., Lefranc, M.-P.: Reconstructing the duplication history of tandemly repeated genes. Molecular Biology and Evolution 19(3), 278–288 (2002)
Ergün, F., Muthukrishnan, S.M., Şahinalp, S.C.: Comparing sequences with segment rearrangements. In: Pandya, P.K., Radhakrishnan, J. (eds.) FSTTCS 2003. LNCS, vol. 2914, pp. 183–194. Springer, Heidelberg (2003)
Kahn, C.L., Mozes, S., Raphael, B.J.: Efficient algorithms for analyzing segmental duplications with deletions and inversions in genomes. Algorithms for Molecular Biology 5(1), 11 (2010)
Kärkkäinen, J., Sanders, P., Burkhardt, S.: Linear work suffix array construction. Journal of the ACM 53(6), 918–936 (2006)
Kasai, T., Lee, G.H., Arimura, H., Arikawa, S., Park, K.: Linear-time longest-common-prefix computation in suffix arrays and its applications. In: Amir, A., Landau, G.M. (eds.) CPM 2001. LNCS, vol. 2089, pp. 181–192. Springer, Heidelberg (2001)
Lajoie, M., Bertrand, D., El-Mabrouk, N., Gascuel, O.: Duplication and inversion history of a tandemly repeated genes family. Journal of Computational Biology 14(4), 462–468 (2007)
Lopresti, D.P., Tomkins, A.: Block edit models for approximate string matching. Theoretical Computer Science 181(1), 159–179 (1997)
Ma, J., Ratan, A., Raney, B.J., Suh, B.B., Miller, W., Haussler, D.: The infinite sites model of genome evolution. Proceedings of the National Academy of Sciences USA 105(38), 14254–14261 (2008)
Mehlhorn, K., Näher, S.: Dynamic fractional cascading. Algorithmica 5(2), 215–241 (1990)
Nadeau, J.H., Taylor, B.A.: Lengths of chromosomal segments conserved since divergence of man and mouse. Proceedings of the National Academy of Sciences USA 81(3), 814–818 (1984)
Shapira, D., Storer, J.A.: Large edit distance with multiple block operations. In: Nascimento, M.A., de Moura, E.S., Oliveira, A.L. (eds.) SPIRE 2003. LNCS, vol. 2857, pp. 369–377. Springer, Heidelberg (2003)
Shapira, D., Storer, J.A.: Edit distance with move operations. Journal of Discrete Algorithms 5(2), 380–392 (2007)
Song, G., Zhang, L., Vinar, T., Miller, W.: CAGE: Combinatorial Analysis of Gene-cluster Evolution. Journal of Computational Biology 17(9), 1227–1232 (2010)
Vinar, T., Brejova, B., Song, G., Siepel, A.: Reconstructing histories of complex gene clusters on a phylogeny. Journal of Computational Biology 17(9), 1267–1269 (2010)
Zhang, Y., Song, G., Vinar, T., Green, E.D., Siepel, A., Miller, W.: Evolutionary history reconstruction for mammalian complex gene clusters. Journal of Computational Biology 16(8), 1051–1060 (2009)
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Brejová, B., Landau, G.M., Vinař, T. (2011). Fast Computation of a String Duplication History under No-Breakpoint-Reuse. In: Grossi, R., Sebastiani, F., Silvestri, F. (eds) String Processing and Information Retrieval. SPIRE 2011. Lecture Notes in Computer Science, vol 7024. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24583-1_15
DOI: https://doi.org/10.1007/978-3-642-24583-1_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24582-4
Online ISBN: 978-3-642-24583-1
eBook Packages: Computer Science, Computer Science (R0)