Abstract
There is widespread interest in comparative genomics in determining if historically and/or functionally related genes are spatially clustered in the genome, and whether the same sets of genes reappear in clusters in two or more genomes. We formalize and analyze the desirable properties of gene clusters and cluster definitions. Through detailed analysis of two commonly applied types of cluster, r-windows and max-gap, we investigate the extent to which a single definition can embody all of these properties simultaneously. We show that many of the most important properties are difficult to satisfy within the same definition. We also examine whether one commonly assumed property, which we call nestedness, is satisfied by the structures present in real genomic data.
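As a concrete illustration of one of the two cluster types named above, the following minimal sketch tests whether a gene set is a max-gap cluster in two genomes. It is not taken from the paper. It assumes the usual formulation in which a set of genes qualifies when, in each genome, no more than g non-member genes lie between adjacent members of the set. The function names, the position dictionaries, and the choice to measure gaps in intervening genes are all illustrative assumptions.

```python
from typing import Dict, Iterable, Set


def is_max_gap_chain(positions: Iterable[int], g: int) -> bool:
    """Return True if adjacent positions are separated by at most g genes.

    Assumption: positions are gene-order indices on one chromosome, and the
    gap between two members is the number of genes lying strictly between them.
    """
    ordered = sorted(positions)
    return all(b - a - 1 <= g for a, b in zip(ordered, ordered[1:]))


def is_max_gap_cluster(genes: Set[str],
                       pos_a: Dict[str, int],
                       pos_b: Dict[str, int],
                       g: int) -> bool:
    """Return True if `genes` forms a max-gap chain in both genomes."""
    return (is_max_gap_chain((pos_a[x] for x in genes), g)
            and is_max_gap_chain((pos_b[x] for x in genes), g))


# Hypothetical gene orders in two genomes (index = position along the chromosome).
pos_a = {"a": 0, "b": 1, "c": 3, "d": 9}
pos_b = {"a": 5, "b": 2, "c": 4, "d": 3}

small = is_max_gap_cluster({"a", "b", "c"}, pos_a, pos_b, g=1)
large = is_max_gap_cluster({"a", "b", "c", "d"}, pos_a, pos_b, g=1)
```

In this toy input the three-gene set satisfies the gap constraint in both genomes, while adding gene d violates it in the first genome. Note that this check only validates a given set; it does not search for clusters.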
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hoberman, R., Durand, D. (2005). The Incompatible Desiderata of Gene Cluster Properties. In: McLysaght, A., Huson, D.H. (eds) Comparative Genomics. RCG 2005. Lecture Notes in Computer Science, vol 3678. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11554714_7
DOI: https://doi.org/10.1007/11554714_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28932-6
Online ISBN: 978-3-540-31814-9
eBook Packages: Computer Science (R0)