Clonality Inference from Single Tumor Samples Using Low Coverage Sequence Data

  • Nilgun Donmez
  • Salem Malikic
  • Alexander W. Wyatt
  • Martin E. Gleave
  • Colin C. Collins
  • S. Cenk Sahinalp
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9649)


Inference of intra-tumor heterogeneity can provide valuable insight into cancer evolution. Somatic mutations detected by sequencing can help estimate the purity of a tumor sample and reconstruct its subclonal composition. While several methods have been developed to infer intra-tumor heterogeneity, the majority of these tools rely on variant allele frequencies estimated via ultra-deep sequencing of multiple samples from the same tumor. In practice, obtaining sequencing data from a large number of samples per patient is feasible only in a few cancer types, such as liquid tumors, or in rare cases involving solid tumors selected for research. We introduce CTPsingle, which aims to infer the subclonal composition using low-coverage sequencing data from a single tumor sample. We show that CTPsingle is able to infer the purity and the clonality of single-sample tumors with high accuracy even when restricted to a coverage depth of \(\sim\)30x.


Keywords: Intra-tumor heterogeneity · Cancer progression · DNA sequencing



This project was funded by a Prostate Cancer Canada Movember Team grant and the Terry Fox Research Institute New Frontiers Program to CCC; NSERC Discovery Frontiers Grant on the Cancer Genome Collaboratory, Genome Canada Bioinformatics and Computational Biology Program Grant and NSERC Discovery Grant to SCS; NSERC CREATE (139277) fellowship and Vanier Canada Graduate Scholarship to SM.



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Nilgun Donmez (1, 2)
  • Salem Malikic (1, 2)
  • Alexander W. Wyatt (2, 3)
  • Martin E. Gleave (2)
  • Colin C. Collins (2, 3)
  • S. Cenk Sahinalp (1, 2, 4)

  1. School of Computing Science, Simon Fraser University, Burnaby, Canada
  2. Vancouver Prostate Centre, Vancouver, Canada
  3. Department of Urologic Sciences, University of British Columbia, Vancouver, Canada
  4. School of Informatics and Computing, Indiana University, Bloomington, USA
