The Copy-Number Tree Mixture Deconvolution Problem and Applications to Multi-sample Bulk Sequencing Tumor Data

  • Simone Zaccaria
  • Mohammed El-Kebir
  • Gunnar W. Klau
  • Benjamin J. Raphael (corresponding author)
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10229)


Cancer is an evolutionary process driven by somatic mutation. This process can be represented as a phylogenetic tree. Constructing such a phylogenetic tree from genome sequencing data is a challenging task due to the mutational complexity of cancer and the fact that nearly all cancer sequencing is of bulk tissue, measuring a superposition of somatic mutations present in different cells. We study the problem of reconstructing tumor phylogenies from copy number aberrations (CNAs) measured in bulk-sequencing data. We introduce the Copy-Number Tree Mixture Deconvolution (CNTMD) problem, which aims to find the phylogenetic tree with the fewest CNAs that explain the copy number data from multiple samples of a tumor. CNTMD generalizes two approaches that have been researched intensively in recent years: deconvolution/factorization algorithms that aim to infer the number and proportions of clones in a mixed tumor sample; and phylogenetic models of copy number evolution that model the dependencies between copy number events that affect the same genomic loci. We design an algorithm for solving the CNTMD problem and apply the algorithm to both simulated and real data. On simulated data, we find that our algorithm outperforms existing approaches that perform either deconvolution or phylogenetic tree construction under the assumption of a single tumor clone per sample. On real data, we analyze multiple samples from a prostate cancer patient, identifying clones within these samples and a phylogenetic tree that relates these clones and their differing proportions across samples. This phylogenetic tree provides a higher-resolution view of copy number evolution of this cancer than published analyses.
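
The idea that a bulk sample measures a superposition of clones can be made concrete with a small sketch. The Python below assumes a simple linear mixture model, in which the fractional copy number of each genomic segment in a sample is the proportion-weighted average of the integer copy numbers of the clones in that sample. This model, the function name `mix_clones`, the variable names and the toy numbers are all illustrative assumptions and are not taken from the paper. The sketch shows only the forward (mixing) direction; the CNTMD problem concerns the much harder inverse task of recovering clones, proportions and a tree from the mixed data.

```python
from typing import List


def mix_clones(proportions: List[List[float]],
               clone_profiles: List[List[int]]) -> List[List[float]]:
    """Mix integer clone copy-number profiles into bulk fractional profiles.

    proportions[p][i]    : fraction of sample p made up of clone i
                           (assumed: each row sums to 1)
    clone_profiles[i][s] : integer copy number of segment s in clone i
    Returns mixed[p][s]  : fractional copy number of segment s in sample p
    """
    n_clones = len(clone_profiles)
    n_segments = len(clone_profiles[0])
    mixed = []
    for row in proportions:
        if len(row) != n_clones:
            raise ValueError("one proportion per clone is required")
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("clone proportions in a sample must sum to 1")
        # Weighted average over clones, computed segment by segment.
        mixed.append([
            sum(row[i] * clone_profiles[i][s] for i in range(n_clones))
            for s in range(n_segments)
        ])
    return mixed


# Hypothetical toy data: three clones over three segments.
# The first clone plays the role of a normal diploid genome.
clone_profiles = [
    [2, 2, 2],
    [3, 2, 1],
    [3, 4, 1],
]
# Two bulk samples containing the clones in different proportions.
proportions = [
    [0.5, 0.5, 0.0],
    [0.2, 0.3, 0.5],
]
bulk = mix_clones(proportions, clone_profiles)
```

In this toy setting the second sample contains all three clones, so its fractional copy numbers are generally non-integer. A deconvolution method observes only values like those in `bulk` and must infer both `proportions` and `clone_profiles`.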


Keywords (added by machine, not by the authors): Integer Linear Programming; Copy Number Aberration; Integer Linear Programming Formulation; Interval Event; Tumor Clone



This work is supported by a US National Science Foundation (NSF) CAREER Award (CCF-1053753) and US National Institutes of Health (NIH) grants R01HG005690 and R01HG007069 to BJR. BJR is supported by a Career Award at the Scientific Interface from the Burroughs Wellcome Fund and an Alfred P. Sloan Research Fellowship.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Simone Zaccaria (1, 2)
  • Mohammed El-Kebir (2, 3)
  • Gunnar W. Klau (2, 4, 5)
  • Benjamin J. Raphael (2, 3), corresponding author

  1. Dipartimento di Informatica, Univ. degli Studi di Milano-Bicocca, Milan, Italy
  2. Department of Computer Science, Brown University, Providence, USA
  3. Department of Computer Science, Princeton University, Princeton, USA
  4. Life Sciences Group, Centrum Wiskunde & Informatica (CWI), Amsterdam, The Netherlands
  5. Algorithmic Bioinformatics, Heinrich Heine University, Düsseldorf, Germany
