
Consensus-Based Identification and Comparative Analysis of Structural Variants and Their Influence on 3D Genome Structure Using Long- and Short-Read Sequencing Technologies in Polish Families

  • Conference paper
Proceedings of International Conference on Frontiers in Computing and Systems

Abstract

Structural variants (SVs) such as deletions, duplications, insertions, or inversions are alterations in the human genome that may be linked to the development of human diseases. A wide range of technologies is currently available to detect and analyze SVs, but the limitations of each method reduce overall accuracy. Over the past few years, the need has arisen for a reliable computational pipeline that merges and compares the SVs reported by various tools to obtain accurate SVs for downstream analysis. In this study, we performed a detailed analysis of long-read sequencing of the human genome and compared it with short-read sequencing using Illumina technology in terms of the distribution of SVs. The SVs were identified in two families with three members each (mother, father, son) using fifteen independent SV callers. We then used the ConsensuSV algorithm to merge the results of these SV callers into a reliable list of SVs for each family member. Furthermore, we studied the influence of SVs on chromatin interaction-based paired-end tags (PETs). Finally, we compared the length and number distributions of long-read-based and short-read-based SVs and their respective mappings onto PETs. We conclude that SVs detected by our algorithm from Oxford Nanopore Technologies (ONT) sequencing data are superior to those detected from Illumina data across all SV sizes and lengths, as well as in the number of SVs mapped to PETs.
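
The two computational steps named in the abstract, merging calls from several SV callers and mapping the merged SVs onto PETs, can be illustrated with a minimal sketch. This is not the ConsensuSV algorithm, whose details are not given here. It assumes a simple reciprocal-overlap clustering with a minimum number of supporting callers, and it treats PET anchors as plain genomic intervals. The names `SV`, `consensus_calls`, `count_svs_on_anchors` and the thresholds `min_callers` and `min_ro` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SV:
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    svtype: str  # e.g. "DEL", "DUP", "INV"
    caller: str  # name of the tool that reported the call


def reciprocal_overlap(a: SV, b: SV) -> float:
    """Smaller of the two overlap fractions; 0.0 if chromosome or type differ."""
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return 0.0
    inter = min(a.end, b.end) - max(a.start, b.start)
    if inter <= 0:
        return 0.0
    return min(inter / (a.end - a.start), inter / (b.end - b.start))


def consensus_calls(calls: List[SV], min_callers: int = 2,
                    min_ro: float = 0.5) -> List[SV]:
    """Greedily cluster calls against each cluster's first member, then keep
    clusters supported by at least `min_callers` distinct callers.
    (Assumed illustrative rule, not the published method.)"""
    clusters: List[List[SV]] = []
    ordered = sorted(calls, key=lambda s: (s.chrom, s.svtype, s.start, s.end))
    for sv in ordered:
        for cluster in clusters:
            if reciprocal_overlap(cluster[0], sv) >= min_ro:
                cluster.append(sv)
                break
        else:
            clusters.append([sv])

    merged: List[SV] = []
    for cluster in clusters:
        if len({sv.caller for sv in cluster}) < min_callers:
            continue
        # Use the middle element of the sorted breakpoints as the consensus.
        starts = sorted(sv.start for sv in cluster)
        ends = sorted(sv.end for sv in cluster)
        mid = len(cluster) // 2
        seed = cluster[0]
        merged.append(SV(seed.chrom, starts[mid], ends[mid],
                         seed.svtype, "consensus"))
    return merged


def count_svs_on_anchors(svs: List[SV],
                         anchors: List[Tuple[str, int, int]]) -> int:
    """Count SVs that overlap at least one PET anchor (chrom, start, end)."""
    return sum(
        1 for sv in svs
        if any(sv.chrom == c and sv.start < e and s < sv.end
               for c, s, e in anchors)
    )
```

Insertions occupy no span on the reference, so a real pipeline would need a separate rule for them (for example, breakpoint distance); this sketch only covers interval-like SVs such as deletions, duplications, and inversions.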

Acknowledgements

This work has been supported by National Science Centre, Poland (2019/35/O/ST6/02484 and 2020/37/B/NZ2/03757); Foundation for Polish Science co-financed by the European Union under the European Regional Development Fund (TEAM to DP). The work has been co-supported by European Commission Horizon 2020 Marie Skłodowska-Curie ITN Enhpathy grant “Molecular Basis of Human enhanceropathies” and National Institute of Health USA 4DNucleome grant 1U54DK107967-01 “Nucleome Positioning System for Spatiotemporal Genome Organization and Regulation.” Research was co-funded by Warsaw University of Technology within the Excellence Initiative: Research University (IDUB) programme. Computations were performed thanks to the Laboratory of Bioinformatics and Computational Genomics, Faculty of Mathematics and Information Science, Warsaw University of Technology using the Artificial Intelligence HPC platform financed by the Polish Ministry of Science and Higher Education (decision no. 7054/IA/SP/2020 of 2020-08-28).

Author information

Corresponding author

Correspondence to Dariusz Plewczynski.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Chiliński, M. et al. (2023). Consensus-Based Identification and Comparative Analysis of Structural Variants and Their Influence on 3D Genome Structure Using Long- and Short-Read Sequencing Technologies in Polish Families. In: Basu, S., Kole, D.K., Maji, A.K., Plewczynski, D., Bhattacharjee, D. (eds) Proceedings of International Conference on Frontiers in Computing and Systems. Lecture Notes in Networks and Systems, vol 404. Springer, Singapore. https://doi.org/10.1007/978-981-19-0105-8_5

  • DOI: https://doi.org/10.1007/978-981-19-0105-8_5

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-0104-1

  • Online ISBN: 978-981-19-0105-8

  • eBook Packages: Engineering, Engineering (R0)
