Abstract
Structural variants (SVs), such as deletions, duplications, insertions, and inversions, are alterations in the human genome that may be linked to the development of human diseases. A wide range of technologies is currently available to detect and analyze SVs, but the limitations of each method reduce overall accuracy. Over the past few years, a need has arisen for a reliable computational pipeline that merges and compares the SVs reported by various tools to obtain accurate SVs for downstream analysis. In this study, we performed a detailed analysis of long-read sequencing of the human genome and compared it with short-read sequencing using Illumina technology in terms of the distribution of SVs. The SVs were identified in two families with three members each (mother, father, son) using fifteen independent SV callers. We then used the ConsensuSV algorithm to merge the results of these SV callers and obtain a reliable list of SVs for each family member. Furthermore, we studied the influence of SVs on chromatin interaction-based paired-end tags (PETs). Finally, we compared the length and number-wise distributions of long-read-based and short-read-based SVs and their respective mapping onto PETs. We conclude that SVs detected by our algorithm from Oxford Nanopore Technologies (ONT) sequencing data are superior to those detected from Illumina data across all SV sizes and lengths, as well as in the number of SVs mapped to PETs.
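The idea of merging calls from several SV callers can be illustrated with a minimal sketch. This is not the ConsensuSV algorithm, whose details are not given in this abstract; it is a simple majority-vote merge based on reciprocal overlap, and the names `SV`, `reciprocal_overlap`, `consensus_calls`, `min_overlap`, and `min_callers` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class SV(NamedTuple):
    chrom: str
    start: int
    end: int
    svtype: str  # e.g. "DEL", "DUP", "INV"
    caller: str  # name of the tool that reported the call


def reciprocal_overlap(a: SV, b: SV) -> float:
    """Shared length divided by the length of the longer interval."""
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return shared / max(a.end - a.start, b.end - b.start)


def consensus_calls(calls, min_overlap=0.5, min_callers=2):
    """Illustrative majority-vote merge (an assumption, not ConsensuSV)."""
    # Only calls on the same chromosome and of the same type may be merged.
    groups = defaultdict(list)
    for sv in calls:
        groups[(sv.chrom, sv.svtype)].append(sv)

    merged = []
    for (chrom, svtype), svs in groups.items():
        clusters = []
        for sv in sorted(svs, key=lambda s: (s.start, s.end)):
            # Greedy: join the first cluster whose seed call overlaps enough.
            for cluster in clusters:
                if reciprocal_overlap(cluster[0], sv) >= min_overlap:
                    cluster.append(sv)
                    break
            else:
                clusters.append([sv])
        for cluster in clusters:
            callers = {sv.caller for sv in cluster}
            if len(callers) >= min_callers:
                # Report the (upper) median breakpoints of the cluster.
                starts = sorted(sv.start for sv in cluster)
                ends = sorted(sv.end for sv in cluster)
                merged.append((chrom, starts[len(starts) // 2],
                               ends[len(ends) // 2], svtype, len(callers)))
    return sorted(merged)


example = [
    SV("chr1", 100, 200, "DEL", "callerA"),
    SV("chr1", 110, 210, "DEL", "callerB"),
    SV("chr1", 5000, 6000, "DEL", "callerC"),
    SV("chr1", 105, 205, "DUP", "callerA"),
]
print(consensus_calls(example))
```

In this sketch, a call is kept only when at least `min_callers` distinct tools report overlapping calls of the same type. Zero-length calls such as insertions reported at a single position never overlap under this definition and would need separate handling, for example breakpoint-distance matching.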
Acknowledgements
This work has been supported by the National Science Centre, Poland (2019/35/O/ST6/02484 and 2020/37/B/NZ2/03757) and by the Foundation for Polish Science, co-financed by the European Union under the European Regional Development Fund (TEAM to DP). The work has been co-supported by the European Commission Horizon 2020 Marie Skłodowska-Curie ITN Enhpathy grant “Molecular Basis of Human enhanceropathies” and the National Institutes of Health (USA) 4DNucleome grant 1U54DK107967-01 “Nucleome Positioning System for Spatiotemporal Genome Organization and Regulation.” Research was co-funded by Warsaw University of Technology within the Excellence Initiative: Research University (IDUB) programme. Computations were performed thanks to the Laboratory of Bioinformatics and Computational Genomics, Faculty of Mathematics and Information Science, Warsaw University of Technology, using the Artificial Intelligence HPC platform financed by the Polish Ministry of Science and Higher Education (decision no. 7054/IA/SP/2020 of 2020-08-28).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Chiliński, M. et al. (2023). Consensus-Based Identification and Comparative Analysis of Structural Variants and Their Influence on 3D Genome Structure Using Long- and Short-Read Sequencing Technologies in Polish Families. In: Basu, S., Kole, D.K., Maji, A.K., Plewczynski, D., Bhattacharjee, D. (eds) Proceedings of International Conference on Frontiers in Computing and Systems. Lecture Notes in Networks and Systems, vol 404. Springer, Singapore. https://doi.org/10.1007/978-981-19-0105-8_5
DOI: https://doi.org/10.1007/978-981-19-0105-8_5
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-0104-1
Online ISBN: 978-981-19-0105-8
eBook Packages: Engineering, Engineering (R0)