Consensus-Based Identification and Comparative Analysis of Structural Variants and Their Influence on 3D Genome Structure Using Long- and Short-Read Sequencing Technologies in Polish Families

Chiliński, Mateusz; Gadakh, Sachin; Sengupta, Kaustav; Jodkowska, Karolina; Zawrotna, Natalia; Gawor, Jan; Pietal, Michal; Plewczynski, Dariusz

doi:10.1007/978-981-19-0105-8_5

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 404)

Abstract

Structural variants (SVs) such as deletions, duplications, insertions, and inversions are alterations in the human genome that may be linked to the development of human diseases. A wide range of technologies is currently available to detect and analyze SVs, but the limitations of each method lower the overall accuracy. Over the past few years, a need has arisen for a reliable computational pipeline that merges and compares the SVs reported by various tools to obtain accurate SVs for downstream analysis. In this study, we performed a detailed analysis of long-read sequencing of the human genome and compared it with short-read sequencing using Illumina technology in terms of the distribution of SVs. The SVs were identified in two families with three members each (mother, father, son) using fifteen independent SV callers. We then used the ConsensuSV algorithm to merge the results of these SV callers into a reliable list of SVs for each family member. Furthermore, we studied the influence of SVs on chromatin interaction-based paired-end tags (PETs). Finally, we compared the length and count distributions of long-read-based and short-read-based SVs and their respective mappings onto PETs. We conclude that SVs detected by our algorithm from Oxford Nanopore Technologies (ONT) sequencing data are superior to those from Illumina data across all SV sizes and lengths, as well as in the number of SVs mapped to PETs.
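
The two steps described in the abstract, merging calls from several SV callers and then mapping the merged SVs onto PETs, can be illustrated with a minimal Python sketch. This is not the ConsensuSV algorithm, whose details are not given here. It is a generic breakpoint-tolerance clustering written under stated assumptions: the `SV` record, the `tolerance` and `min_callers` parameters, and the representation of PET anchors as plain `(chrom, start, end)` intervals are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SV:
    """Hypothetical minimal SV record (not a format used by the authors)."""
    chrom: str
    start: int
    end: int
    svtype: str  # e.g. "DEL", "DUP", "INS", "INV"
    caller: str  # name of the tool that reported the call


def consensus_svs(calls: List[SV], tolerance: int = 100,
                  min_callers: int = 2) -> List[SV]:
    """Cluster calls of the same chromosome and type whose breakpoints lie
    within `tolerance` bp of the first call in a cluster, then keep only
    clusters reported by at least `min_callers` distinct callers."""
    clusters: List[List[SV]] = []
    for call in sorted(calls, key=lambda c: (c.chrom, c.svtype, c.start, c.end)):
        for cluster in clusters:
            seed = cluster[0]
            if (seed.chrom == call.chrom and seed.svtype == call.svtype
                    and abs(seed.start - call.start) <= tolerance
                    and abs(seed.end - call.end) <= tolerance):
                cluster.append(call)
                break
        else:
            # No existing cluster matched, so this call seeds a new one.
            clusters.append([call])

    merged: List[SV] = []
    for cluster in clusters:
        if len({c.caller for c in cluster}) >= min_callers:
            # Report the integer mean of the clustered breakpoints.
            start = sum(c.start for c in cluster) // len(cluster)
            end = sum(c.end for c in cluster) // len(cluster)
            merged.append(SV(cluster[0].chrom, start, end,
                             cluster[0].svtype, "consensus"))
    return merged


def svs_overlapping_anchors(svs: List[SV],
                            anchors: List[Tuple[str, int, int]]) -> List[SV]:
    """Return the SVs that overlap at least one anchor interval
    (closed intervals on the same chromosome)."""
    return [sv for sv in svs
            if any(chrom == sv.chrom and a_start <= sv.end and sv.start <= a_end
                   for chrom, a_start, a_end in anchors)]


calls = [
    SV("chr1", 1000, 2000, "DEL", "callerA"),
    SV("chr1", 1020, 1990, "DEL", "callerB"),
    SV("chr1", 5000, 6000, "DEL", "callerA"),  # single caller: dropped
    SV("chr2", 300, 400, "DUP", "callerC"),
    SV("chr2", 310, 405, "DUP", "callerC"),    # same caller twice: dropped
]
consensus = consensus_svs(calls)
anchors = [("chr1", 1500, 1600), ("chr2", 0, 100)]
mapped = svs_overlapping_anchors(consensus, anchors)
```

In this toy input only the chr1 deletion is supported by two distinct callers, so it is the only consensus call, and it overlaps the first anchor. A real pipeline would read calls from VCF files and would need type-specific handling, for example for insertions, which have no meaningful end coordinate.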

Acknowledgements

This work has been supported by the National Science Centre, Poland (2019/35/O/ST6/02484 and 2020/37/B/NZ2/03757) and by the Foundation for Polish Science, co-financed by the European Union under the European Regional Development Fund (TEAM to DP). The work has been co-supported by the European Commission Horizon 2020 Marie Skłodowska-Curie ITN Enhpathy grant “Molecular Basis of Human enhanceropathies” and the National Institutes of Health (USA) 4DNucleome grant 1U54DK107967-01 “Nucleome Positioning System for Spatiotemporal Genome Organization and Regulation.” Research was co-funded by Warsaw University of Technology within the Excellence Initiative: Research University (IDUB) programme. Computations were performed thanks to the Laboratory of Bioinformatics and Computational Genomics, Faculty of Mathematics and Information Science, Warsaw University of Technology, using the Artificial Intelligence HPC platform financed by the Polish Ministry of Science and Higher Education (decision no. 7054/IA/SP/2020 of 2020-08-28).

Author information

Authors and Affiliations

Laboratory of Bioinformatics and Computational Genomics, Faculty of Mathematics and Information Science, Warsaw University of Technology, Koszykowa 75, 00-662, Warsaw, Poland
Mateusz Chiliński & Dariusz Plewczynski
Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Banacha 2c, 02-097, Warsaw, Poland
Mateusz Chiliński, Sachin Gadakh, Kaustav Sengupta, Karolina Jodkowska, Natalia Zawrotna & Dariusz Plewczynski
Centre for Advanced Materials and Technologies (CEZAMAT), Warsaw University of Technology, Poleczki 19, 02-822, Warsaw, Poland
Karolina Jodkowska
DNA Sequencing and Oligonucleotide Synthesis Laboratory, Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Pawinskiego 5a, 02-106, Warsaw, Poland
Jan Gawor
Faculty of Computer and Electrical Engineering, Rzeszow University of Technology, Powstańców Warszawy 12, 35-959, Rzeszów, Poland
Michal Pietal

Corresponding author

Correspondence to Dariusz Plewczynski.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Jadavpur University, Kolkata, India
Subhadip Basu
Department of Computer Science and Engineering, Jalpaiguri Government Engineering College, Jalpaiguri, West Bengal, India
Dipak Kumar Kole
Department of Information Technology, North-Eastern Hill University, Shillong, Meghalaya, India
Arnab Kumar Maji
Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
Dariusz Plewczynski
Department of Computer Science and Engineering, Jadavpur University, Kolkata, India
Debotosh Bhattacharjee

Cite this paper

Chiliński, M. et al. (2023). Consensus-Based Identification and Comparative Analysis of Structural Variants and Their Influence on 3D Genome Structure Using Long- and Short-Read Sequencing Technologies in Polish Families. In: Basu, S., Kole, D.K., Maji, A.K., Plewczynski, D., Bhattacharjee, D. (eds) Proceedings of International Conference on Frontiers in Computing and Systems. Lecture Notes in Networks and Systems, vol 404. Springer, Singapore. https://doi.org/10.1007/978-981-19-0105-8_5

DOI: https://doi.org/10.1007/978-981-19-0105-8_5
Published: 28 June 2022
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-0104-1
Online ISBN: 978-981-19-0105-8
eBook Packages: Engineering, Engineering (R0)
