Finding Nested Common Intervals Efficiently

  • Guillaume Blin
  • Jens Stoye
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5817)

Abstract

In this paper, we study the problem of efficiently finding gene clusters formalized by nested common intervals between two genomes represented either as permutations or as sequences. Considering permutations, we give several algorithms whose running time depends on the size of the actual output rather than the output in the worst case. Indeed, we first provide a straightforward O(n^3) time algorithm for finding all nested common intervals. We reduce this complexity by providing an O(n^2) time algorithm computing an irredundant output. Finally, we show, by providing a third algorithm, that finding only the maximal nested common intervals can be done in linear time. Considering sequences, we provide solutions (modifications of previously defined algorithms and a new algorithm) for different variants of the problem, depending on the treatment one wants to apply to duplicated genes.
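
As a rough illustration of the permutation case, the sketch below enumerates nested common intervals by brute force. It is not the authors' algorithm. It assumes that one genome is the identity permutation 1..n and the other is given as the list `pi`, and it assumes the usual definition (not spelled out in the abstract) that a common interval is nested if it has size 1 or contains a nested common interval with one element fewer. The function name `nested_common_intervals` is hypothetical.

```python
def nested_common_intervals(pi):
    """Return all nested common intervals of size >= 2 as (i, j) position
    pairs (0-based, inclusive) in pi, taken against the identity permutation.

    Assumed definition: a window pi[i..j] is a common interval if its values
    form a set of consecutive integers; it is nested if it has size 1 or if
    one of the two windows obtained by dropping an end element is itself a
    nested common interval.
    """
    n = len(pi)
    # nested[i][j] is True when pi[i..j] is a nested common interval.
    nested = [[False] * n for _ in range(n)]
    for i in range(n):
        nested[i][i] = True  # every single gene is trivially nested

    result = []
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            window = pi[i:j + 1]
            # Distinct values are consecutive iff max - min equals size - 1.
            is_common = max(window) - min(window) == j - i
            # Only the two windows of size length - 1 can witness nestedness.
            if is_common and (nested[i + 1][j] or nested[i][j - 1]):
                nested[i][j] = True
                result.append((i, j))
    return result


if __name__ == "__main__":
    print(nested_common_intervals([3, 1, 2, 4, 5]))
    print(nested_common_intervals([2, 4, 1, 3]))
```

As written, the sketch examines a quadratic number of windows and scans each one, so it takes cubic time in the worst case. For `[2, 4, 1, 3]` the whole permutation is a common interval, but no window of size 3 is, so under the assumed definition nothing of size 2 or more is nested.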

Keywords

Gene Cluster; Linear Time; Time Algorithm; Identity Permutation; Consecutive Integer
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Guillaume Blin (1)
  • Jens Stoye (2)

  1. Université Paris-Est, LIGM - UMR CNRS 8049, France
  2. Technische Fakultät, Universität Bielefeld, Germany