Abstract
In recent years, next-generation sequencing has made it possible to obtain high-resolution landscapes of genetic changes at the single-nucleotide level. A growing number of methods have been proposed for efficient and effective analysis of cancer sequencing data. A data simulator is a crucial tool for this development: it not only tests and evaluates proposed approaches, but also provides feedback for further improvement. Several simulators have been released for generating next-generation sequencing data. However, to the best of our knowledge, none of them considers clonality information, even though clonal heterogeneity is reported to be widespread in tumor samples. Somatic mutational events usually show a wide spectrum of variant allelic frequencies, and some of them are detectable only in one or several clonal lineages. In this article, we introduce a Tumor-Normal sequencing Simulator, TNSim, which generates next-generation sequencing data that incorporate clonality information. The simulator mimics a tumor sample and its paired normal sample, for which germline variants and somatic mutations can be configured separately. Tumor purity is adjustable. The clonal architecture is preassigned as one or more clonal lineages, where each lineage consists of a set of somatic mutations with similar variant allelic frequencies. A group of experiments is conducted to evaluate the simulator's performance. The statistical features of the artificial sequencing reads are comparable to those of real tumor sequencing data from a sample consisting of multiple sub-clones. The source code is available at http://github.com/lnmxgy/TNSim for academic use only.
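The relationship between tumor purity, clonal lineages and variant allelic frequency (VAF) described in the abstract can be illustrated with a small sketch. This is not TNSim's implementation. It assumes, purely for illustration, that every somatic mutation is heterozygous in a copy-neutral diploid region, so that its expected VAF is purity × cellular fraction / 2, and that variant-supporting reads follow a binomial distribution at a fixed depth. The function names, lineage names and parameter values are all hypothetical.

```python
import random


def expected_vaf(purity, cellular_fraction):
    """Expected VAF of a heterozygous somatic mutation in a diploid region.

    Assumption (not taken from TNSim): one of two alleles is mutated in the
    fraction of tumor cells belonging to the lineage, and normal cells carry
    no mutant allele.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must lie in [0, 1]")
    if not 0.0 <= cellular_fraction <= 1.0:
        raise ValueError("cellular_fraction must lie in [0, 1]")
    return purity * cellular_fraction / 2.0


def simulate_alt_counts(purity, lineages, depth, rng):
    """Draw variant-supporting read counts for each lineage's mutations.

    lineages maps a lineage name to (cellular_fraction, number_of_mutations).
    Each read at a mutated site supports the variant with probability equal
    to the expected VAF, i.e. a binomial draw built from Bernoulli trials.
    """
    counts = {}
    for name, (cellular_fraction, n_mutations) in lineages.items():
        p = expected_vaf(purity, cellular_fraction)
        counts[name] = [
            sum(rng.random() < p for _ in range(depth))
            for _ in range(n_mutations)
        ]
    return counts


# Hypothetical sample: 80% purity, a founding clone present in all tumor
# cells and a sub-clone present in 40% of them, 50 mutations each.
DEPTH = 100
rng = random.Random(0)
lineages = {"founding": (1.0, 50), "subclone": (0.4, 50)}
counts = simulate_alt_counts(0.8, lineages, DEPTH, rng)

for name, alt_reads in counts.items():
    mean_vaf = sum(alt_reads) / (len(alt_reads) * DEPTH)
    print(name, "expected", expected_vaf(0.8, lineages[name][0]),
          "observed mean", round(mean_vaf, 3))
```

Under these assumptions, mutations from the same lineage cluster around a common VAF, which is the property the abstract uses to define a clonal lineage. Lowering the purity shifts every cluster toward zero.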
Acknowledgement
This work is supported by the National Science Foundation of China (Grant No: 31701150) and the Fundamental Research Funds for the Central Universities (CXTD2017003).
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Geng, Y., Zhao, Z., Xu, M., Zhang, X., Xiao, X., Wang, J. (2018). TNSim: A Tumor Sequencing Data Simulator for Incorporating Clonality Information. In: Huang, DS., Jo, KH., Zhang, XL. (eds) Intelligent Computing Theories and Application. ICIC 2018. Lecture Notes in Computer Science, vol 10955. Springer, Cham. https://doi.org/10.1007/978-3-319-95933-7_45
DOI: https://doi.org/10.1007/978-3-319-95933-7_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-95932-0
Online ISBN: 978-3-319-95933-7
eBook Packages: Computer Science, Computer Science (R0)