A Guide to Sequencing for Long Repetitive Regions

  • Protocol
Nanopore Sequencing

Part of the book series: Methods in Molecular Biology ((MIMB,volume 2632))

Abstract

Full-length analysis of genes with highly repetitive sequences is challenging in two respects: assembly algorithm and sequencing accuracy. The de Bruijn graph often used in short-read assembly cannot distinguish adjacent repeat units. On the other hand, the accuracy of long reads is not yet high enough to identify each and every repeat unit. In this chapter, I present an example of a strategy to solve these problems and obtain the full length of long repeats by combining the extraction and assembly of repeat units based on overlap-layout-consensus and scaffolding by long reads.
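The first problem named above, that a de Bruijn graph cannot distinguish adjacent repeat units, can be illustrated with a small toy example. The sketch below is not the chapter's pipeline: the flanking sequences, the repeat unit, the k-mer size, and the helper names (`make_gene`, `de_bruijn_edges`) are all hypothetical and chosen only for illustration. It assumes error-free sequences and a k-mer shorter than the repeat unit, and shows that two genes differing only in repeat copy number yield the same set of de Bruijn edges.

```python
def kmers(seq, k):
    """Return the set of distinct k-mers found in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def de_bruijn_edges(seq, k):
    """Return de Bruijn edges as (prefix, suffix) pairs, one per distinct k-mer.

    Multiplicity is discarded on purpose: this is the information a plain
    (unweighted) de Bruijn graph keeps about the sequence.
    """
    return {(km[:-1], km[1:]) for km in kmers(seq, k)}


# Hypothetical toy sequences (assumptions, not data from the chapter).
LEFT = "ATGCCGTA"    # non-repetitive 5' flank
RIGHT = "TTGACCAG"   # non-repetitive 3' flank
UNIT = "GGAGCA"      # one repeat unit
K = 5                # k-mer size, shorter than the repeat unit


def make_gene(copies):
    """Build a toy gene: flank + tandem repeat units + flank."""
    return LEFT + UNIT * copies + RIGHT


gene3 = make_gene(3)
gene7 = make_gene(7)

# The two genes differ in length and copy number...
print(len(gene3), len(gene7))

# ...but their de Bruijn edge sets are compared here for equality.
same_graph = de_bruijn_edges(gene3, K) == de_bruijn_edges(gene7, K)
print("identical de Bruijn edge sets:", same_graph)

# A full-length, error-free sequence still carries the copy number directly.
print(gene3.count(UNIT), gene7.count(UNIT))
```

In this toy case the graph alone cannot say whether the gene holds three or seven units, whereas a sequence spanning the whole region can. Real long reads contain errors, which is the second problem the abstract describes and the reason the strategy combines repeat-unit assembly with long-read scaffolding.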

Author information

Corresponding author

Correspondence to Nobuaki Kono.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Kono, N. (2023). A Guide to Sequencing for Long Repetitive Regions. In: Arakawa, K. (eds) Nanopore Sequencing. Methods in Molecular Biology, vol 2632. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2996-3_10

  • DOI: https://doi.org/10.1007/978-1-0716-2996-3_10

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-2995-6

  • Online ISBN: 978-1-0716-2996-3

  • eBook Packages: Springer Protocols
