Abstract
Copy number variations (CNVs) have increasingly been reported to cause, or predispose to, human disease. However, a large fraction of these CNVs have not been accurately characterized at the single-base-pair level, thereby hampering a better understanding of the mutational mechanisms underlying CNV formation. Here, employing a composite pipeline method derived from various inference-based programs, we have characterized 26 deletion CNVs [including three novel pathogenic CNVs involving an autosomal gene (EXT2) causing hereditary osteochondromas and an X-linked gene (CLCN5) causing Dent disease, as well as 23 CNVs previously identified by inference from a cohort of Canadian autism spectrum disorder families] to the single-base-pair level of accuracy from whole-genome sequencing data. We found that breakpoint-flanking micro-mutations (within 22 bp of the breakpoint) are present in a significant fraction (5/26; 19 %) of the deletion CNVs. This analysis also provided evidence that a recently described error-prone form of DNA repair (i.e., repair of DNA double-strand breaks by templated nucleotide sequence insertions derived from distant regions of the genome) not only causes human genetic disease but also impacts on human genome evolution. Our findings illustrate the importance of precise CNV breakpoint delineation for understanding the underlying mutational mechanisms and have implications for primer design in relation to the detection of deletion CNVs in clinical diagnosis.
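As a rough illustration of the flanking-window criterion and the 5/26 proportion quoted above, the following minimal sketch flags deletions that carry a small variant within 22 bp of either breakpoint. It is not the authors' pipeline: the `Deletion` record, its field names, and the coordinate convention (1-based, inclusive deleted interval; the window counted on the retained sequence either side) are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

FLANK_BP = 22  # window size quoted in the abstract


@dataclass
class Deletion:
    """Hypothetical record for one deletion CNV resolved to base-pair level."""
    name: str
    start: int  # first deleted base (1-based, inclusive); assumed convention
    end: int    # last deleted base (1-based, inclusive); assumed convention
    variant_positions: Tuple[int, ...] = ()  # small variants called nearby


def has_flanking_micro_mutation(d: Deletion, flank: int = FLANK_BP) -> bool:
    """True if any variant lies within `flank` bp outside either breakpoint."""
    for pos in d.variant_positions:
        upstream = d.start - flank <= pos < d.start
        downstream = d.end < pos <= d.end + flank
        if upstream or downstream:
            return True
    return False


def fraction_with_micro_mutations(deletions: Iterable[Deletion]) -> Tuple[int, int, float]:
    """Return (hits, total, hits / total) for a non-empty set of deletions."""
    items = list(deletions)
    hits = sum(has_flanking_micro_mutation(d) for d in items)
    return hits, len(items), hits / len(items)
```

With 5 flagged deletions out of 26, the returned fraction is 5/26, which rounds to the 19 % reported in the abstract.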
Acknowledgments
This work was supported by the National Natural Science Foundation of China [No. 31271342, No. 81371908, No. 81071703], the Ministry of Education of China [20110171110047], Sun Yat-Sen University [No. 10ykyc07], the NCET-12-0564 program and the Basic Research Funds of the key Universities of China [11ykzd10], and the Institut National de la Santé et de la Recherche Médicale (INSERM), France. DNC receives financial support from BIOBASE GmbH through a license agreement with Cardiff University.
Conflict of interest
The authors are not aware of any conflict of interest.
Ethical standards
Informed consent was obtained from each participant. The study was approved by the University Ethics Committee and the principles outlined in the Declaration of Helsinki were followed.
Additional information
Ye Wang, P. Su, B. Hu and W. Zhu share first authorship; Yiming Wang, J.-M. Chen and D. Huang share senior authorship.
The original WGS data of the four Chinese HMO and Dent families have been deposited in the NCBI Sequence Read Archive (SRA; http://www.ncbi.nlm.nih.gov/sra) under accession numbers SRA140245 and 140531. The Canadian autism spectrum disorder data have been previously deposited at the Autism Genetic Resource Exchange and the European Genome–Phenome Archive (http://www.ebi.ac.uk/ega/) under accession number EGAS00001000556.
Cite this article
Wang, Y., Su, P., Hu, B. et al. Characterization of 26 deletion CNVs reveals the frequent occurrence of micro-mutations within the breakpoint-flanking regions and frequent repair of double-strand breaks by templated insertions derived from remote genomic regions. Hum Genet 134, 589–603 (2015). https://doi.org/10.1007/s00439-015-1539-4
DOI: https://doi.org/10.1007/s00439-015-1539-4