Controlling Size When Aligning Multiple Genomic Sequences with Duplications
For a genomic region containing a tandem gene cluster, a proper set of alignments needs to align only orthologous segments, i.e., those separated by a speciation event. Otherwise, methods for finding regions under evolutionary selection will not perform properly. Conversely, the alignments should indicate every orthologous pair of genes or genomic segments. Attaining this goal in practice requires a technique for avoiding a combinatorial explosion in the number of local alignments. To better understand this process, we model it as a graph problem of finding a minimum cardinality set of cliques that contain all edges. We provide an upper bound for an important class of graphs (the problem is NP-hard and very difficult to approximate in the general case), and use the bound and computer simulations to evaluate two heuristic solutions. An implementation of one of them is evaluated on mammalian sequences from the α-globin gene cluster.
Keywords: Pairwise Alignment; Orthologous Pair; Clique Partition; Alignment Graph; Clique Cover
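The graph formulation named in the abstract, covering all edges with as few cliques as possible, can be illustrated with a small greedy sketch. This is not one of the two heuristics evaluated in the paper, which the abstract does not describe. It is a generic greedy edge clique cover, shown only to make the problem statement concrete. The reading that vertices stand for genomic segments, edges for orthologous pairs, and cliques for alignments is an assumption, and the function name `greedy_edge_clique_cover` and the vertex labels are hypothetical.

```python
from itertools import combinations


def greedy_edge_clique_cover(edges):
    """Return a list of cliques (vertex sets) that together contain every edge.

    Generic greedy heuristic, NOT the paper's method: repeatedly take an
    uncovered edge and grow it into a clique, preferring vertices that
    cover the most still-uncovered edges.
    """
    # Build adjacency sets, ignoring self-loops.
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    uncovered = {frozenset((u, v)) for u, v in edges if u != v}
    cliques = []
    while uncovered:
        # Pick the lexicographically smallest uncovered edge (deterministic).
        u, v = sorted(min(uncovered, key=sorted))
        clique = {u, v}
        # Candidates are vertices adjacent to every current clique member.
        candidates = adj[u] & adj[v]
        while candidates:
            best = max(
                sorted(candidates),
                key=lambda w: sum(frozenset((w, x)) in uncovered for x in clique),
            )
            clique.add(best)
            candidates &= adj[best]
        cliques.append(clique)
        uncovered -= {frozenset(pair) for pair in combinations(clique, 2)}
    return cliques


# Hypothetical example: four segments where all pairs except (a, d) are orthologous.
example_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d")]
cover = greedy_edge_clique_cover(example_edges)
```

On this example the sketch returns two cliques, `{a, b, c}` and `{b, c, d}`, which together contain all five edges. A greedy procedure like this gives no optimality guarantee, in keeping with the abstract's remark that the general problem is NP-hard and very difficult to approximate.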