Computing and Combinatorics

Volume 3595 of the series Lecture Notes in Computer Science pp 63-73

Quartet-Based Phylogeny Reconstruction from Gene Orders

  • Tao Liu, Department of Computer Science, U. of New Mexico
  • Jijun Tang, Department of Computer Science & Engineering, U. of South Carolina
  • Bernard M. E. Moret, Department of Computer Science, U. of New Mexico


Phylogenetic reconstruction from gene-rearrangement data is attracting increasing attention from biologists and computer scientists. Methods used in reconstruction include distance-based methods, parsimony methods using sequence encodings, and direct optimization. The last, pioneered by Sankoff and extended by us with the software suite GRAPPA, is the most accurate approach; however, because it searches exhaustively, it can be applied only to small datasets of fewer than 15 taxa. While we have successfully scaled it up to 1,000 genomes by integrating it with a disk-covering method (DCM-GRAPPA), the recursive decomposition may need many levels of recursion to handle datasets with 1,000 or more genomes. We thus investigated quartet-based approaches, which directly decompose a dataset into subsets of four taxa each; such approaches have been well studied for sequence data, but not for gene-rearrangement data. We give an optimization algorithm for the NP-hard problem of computing an optimal tree for each quartet, present a variation of the dyadic method (using heuristics to choose suitable short quartets), and use both in simulation studies. We find that our quartet-based method can handle more genomes than the base version of GRAPPA, thus enabling us to reduce the number of levels of recursion in DCM-GRAPPA, but that it is more sensitive to the rate of evolution, with error rates increasing rapidly as saturation is approached.
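
As a rough illustration of the quartet decomposition mentioned in the abstract, the following Python sketch enumerates every subset of four taxa and picks one of the three possible quartet topologies for each. It is not the optimization algorithm of the paper: the topology is chosen with the standard four-point method applied to breakpoint distances, a much simpler stand-in. The function names, the representation of a genome as a list of signed integers, and the assumption of linear genomes with identical gene content are all choices made for this example only.

```python
from itertools import combinations


def breakpoint_distance(g1, g2):
    """Count adjacencies of g1 that are absent from g2.

    Assumes linear, signed gene orders over the same set of genes.
    An adjacency (a, b) is treated as equal to its reverse (-b, -a).
    """
    adjacencies = set()
    for a, b in zip(g2, g2[1:]):
        adjacencies.add((a, b))
        adjacencies.add((-b, -a))
    return sum((a, b) not in adjacencies for a, b in zip(g1, g1[1:]))


def quartet_topology(d, a, b, c, e):
    """Four-point method: return the pairing with the smallest distance sum.

    d is a nested dict of pairwise distances; the result is a pair of
    pairs such as ((a, b), (c, e)), meaning the split ab|ce.
    """
    options = [
        (d[a][b] + d[c][e], ((a, b), (c, e))),
        (d[a][c] + d[b][e], ((a, c), (b, e))),
        (d[a][e] + d[b][c], ((a, e), (b, c))),
    ]
    return min(options, key=lambda option: option[0])[1]


def all_quartets(genomes):
    """Map every 4-taxon subset (sorted by name) to its inferred split."""
    names = sorted(genomes)
    d = {
        x: {y: breakpoint_distance(genomes[x], genomes[y]) for y in names}
        for x in names
    }
    return {q: quartet_topology(d, *q) for q in combinations(names, 4)}


# Hypothetical toy data: B differs from A by one inversion, D from C by another.
example_genomes = {
    "A": [1, 2, 3, 4, 5, 6, 7, 8],
    "B": [1, 2, 3, 4, 5, 6, -8, -7],
    "C": [-3, -2, -1, 4, 5, 6, 7, 8],
    "D": [-3, -2, -1, 4, 5, -6, 7, 8],
}
example_splits = all_quartets(example_genomes)
```

On this toy input the single quartet resolves as AB|CD. A dataset of n genomes yields n(n-1)(n-2)(n-3)/24 quartets, which is why quartet-based methods typically select a subset of them rather than use all.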