Finding a Perfect Phylogeny from Mixed Tumor Samples

  • Ademir Hujdurović
  • Urša Kačar
  • Martin Milanič
  • Bernard Ries
  • Alexandru I. Tomescu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9289)


Recently, Hajirasouliha and Raphael (WABI 2014) proposed a model for deconvoluting mixed tumor samples measured from a collection of high-throughput sequencing reads. This is related to understanding tumor evolution and critical cancer mutations. In short, their formulation asks to split each row of a binary matrix so that the resulting matrix corresponds to a perfect phylogeny and has the minimum number of rows among all matrices with this property. In this paper we disprove several claims about this problem, including the NP-hardness proof given for it. We nevertheless show that the problem is indeed NP-hard by providing a different proof. We also prove NP-completeness of a variant of this problem proposed in the same paper. On the positive side, we obtain a polynomial-time algorithm for matrix instances in which no column is contained in both columns of a pair of conflicting columns.
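
The notions of conflicting columns and column containment used in the abstract can be illustrated with a minimal Python sketch. It assumes the standard reading from the perfect phylogeny literature, which this page does not spell out: a column is identified with the set of rows in which it has a 1, two columns conflict when these sets intersect but neither contains the other, and a binary matrix corresponds to a perfect phylogeny exactly when no two columns conflict. The function names are hypothetical, and skipping all-zero columns in the last check is an assumption of this sketch. It is not the algorithm from the paper.

```python
from itertools import combinations


def column_supports(matrix):
    """Return, for each column, the set of row indices holding a 1."""
    n_cols = len(matrix[0]) if matrix else 0
    return [{i for i, row in enumerate(matrix) if row[j] == 1}
            for j in range(n_cols)]


def conflicting_pairs(matrix):
    """List column pairs whose supports overlap without being nested."""
    supports = column_supports(matrix)
    pairs = []
    for a, b in combinations(range(len(supports)), 2):
        sa, sb = supports[a], supports[b]
        if sa & sb and not sa <= sb and not sb <= sa:
            pairs.append((a, b))
    return pairs


def is_conflict_free(matrix):
    """True if no two columns conflict (assumed perfect phylogeny test)."""
    return not conflicting_pairs(matrix)


def no_column_inside_conflicting_pair(matrix):
    """Check that no third, nonzero column lies inside both columns
    of a conflicting pair (assumed reading of the abstract's condition)."""
    supports = column_supports(matrix)
    for a, b in conflicting_pairs(matrix):
        common = supports[a] & supports[b]
        for c, sc in enumerate(supports):
            if c not in (a, b) and sc and sc <= common:
                return False
    return True


# Columns 0 and 1 conflict: row 0 has (1,1), row 1 has (1,0), row 2 has (0,1).
mixed = [[1, 1, 0],
         [1, 0, 0],
         [0, 1, 1]]

# Splitting row 0 into [1,0,0] and [0,1,0] (their bitwise OR is the
# original row) removes the conflict.
split = [[1, 0, 0],
         [0, 1, 0],
         [1, 0, 0],
         [0, 1, 1]]

print(conflicting_pairs(mixed))
print(is_conflict_free(split))
print(no_column_inside_conflicting_pair(mixed))
```

Here `split` is only one conflict-free way of splitting the rows of `mixed`. The sketch makes no claim that it has the minimum number of rows, which is the optimization objective described in the abstract.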


Keywords: Polynomial time algorithm, Chromatic number, Binary matrix, Orientable graph, Conflict graph


References

  1. Bafna, V., et al.: A note on efficient computation of haplotypes via perfect phylogeny. J. Comput. Biol. 11(5), 858–866 (2004)
  2. Campbell, P.J., et al.: Subclonal phylogenetic structures in cancer revealed by ultra-deep sequencing. Proc. Natl. Acad. Sci. 105(35), 13081–13086 (2008)
  3. Estabrook, G.F., et al.: An idealized concept of the true cladistic character. Math. Biosci. 23(3–4), 263–272 (1975)
  4. Golumbic, M.C.: Algorithmic Graph Theory and Perfect Graphs. Annals of Discrete Mathematics, vol. 57, 2nd edn. Elsevier Science BV, Amsterdam (2004)
  5. Gusfield, D.: Efficient algorithms for inferring evolutionary trees. Networks 21(1), 19–28 (1991)
  6. Gusfield, D.: Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. Cambridge University Press, New York (1997)
  7. Ha, G., et al.: Titan: inference of copy number architectures in clonal cell populations from tumor whole-genome sequence data. Genome Res. 24(11), 1881–1893 (2014)
  8. Hajirasouliha, I., Raphael, B.J.: Reconstructing mutational history in multiply sampled tumors using perfect phylogeny mixtures. In: Brown, D., Morgenstern, B. (eds.) WABI 2014. LNCS, vol. 8701, pp. 354–367. Springer, Heidelberg (2014)
  9. Holyer, I.: The NP-completeness of edge-coloring. SIAM J. Comput. 10(4), 718–720 (1981)
  10. Isaacs, R.: Infinite families of nontrivial trivalent graphs which are not Tait colorable. Amer. Math. Monthly 82, 221–239 (1975)
  11. Jiao, W., et al.: Inferring clonal evolution of tumors from single nucleotide somatic mutations. BMC Bioinform. 15, 35 (2014)
  12. Kačar, U.: Problemi popolne filogenije (Perfect Phylogeny Problems). Final project paper. University of Primorska, Faculty of Mathematics, Natural Sciences and Information Technologies, Koper, Slovenia (2015)
  13. Koboldt, D.C., et al.: VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res. 22, 568–576 (2012)
  14. Li, Y., Xie, X.: Mixclone: a mixture model for inferring tumor subclonal populations. BMC Genomics 16(S–2), S1 (2015)
  15. Miller, C.A., et al.: SciClone: inferring clonal architecture and tracking the spatial and temporal patterns of tumor evolution. PLoS Comput. Biol. 10(8), e1003665+ (2014)
  16. Newburger, D.E., et al.: Genome evolution during progression to breast cancer. Genome Res. 23(7), 1097–1108 (2013)
  17. Nik-Zainal, S., et al.: The life history of 21 breast cancers. Cell 149(5), 994–1007 (2012)
  18. Oesper, L., et al.: THetA: inferring intra-tumor heterogeneity from high-throughput DNA sequencing data. Genome Biol. 14(7), R80 (2013)
  19. van Rens, K.E., et al.: SNV-PPILP: refined SNV calling for tumor data using perfect phylogenies and ILP. Bioinformatics 31(7), 1133–1135 (2015)
  20. Roth, A., et al.: PyClone: statistical inference of clonal population structure in cancer. Nat. Methods 11(4), 396–398 (2014)
  21. Salari, R., et al.: Inference of tumor phylogenies with improved somatic mutation discovery. J. Comput. Biol. 20(11), 933–944 (2013)

Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Ademir Hujdurović (1, 2)
  • Urša Kačar (2)
  • Martin Milanič (1, 2), corresponding author
  • Bernard Ries (3, 4)
  • Alexandru I. Tomescu (5)

  1. UP IAM, University of Primorska, Koper, Slovenia
  2. UP FAMNIT, University of Primorska, Koper, Slovenia
  3. PSL, Université Paris-Dauphine, Paris, France
  4. CNRS, LAMSADE UMR 7243, Paris, France
  5. Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Helsinki, Finland
