Abstract
Full-length analysis of genes with highly repetitive sequences is challenging in two respects: the assembly algorithm and sequencing accuracy. The de Bruijn graph commonly used in short-read assembly cannot distinguish adjacent repeat units, while the accuracy of long reads is not yet high enough to identify every individual repeat unit. In this chapter, I present an example strategy that addresses both problems and recovers the full length of long repeats by combining two steps: extraction and assembly of repeat units based on overlap-layout-consensus, followed by scaffolding with long reads.
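The limitation of the de Bruijn graph mentioned above can be illustrated with a toy example. The following is a minimal sketch, not the chapter's protocol: the flanking sequences, the repeat unit, and the value of k are hypothetical values chosen for illustration. It builds the set of de Bruijn edges for a sequence with two copies of a repeat unit and for one with five copies, and compares them.

```python
# Toy illustration: a de Bruijn graph built from k-mers loses the copy
# number of a tandem repeat when k is shorter than the repeat unit.
# All sequences below are hypothetical, not real gene sequences.

def kmer_set(seq, k):
    """Return the set of distinct k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def de_bruijn_edges(seq, k):
    """Return de Bruijn edges: each k-mer links its (k-1)-mer prefix to its suffix."""
    return {(kmer[:-1], kmer[1:]) for kmer in kmer_set(seq, k)}

LEFT = "ATGGCCTTAC"       # hypothetical non-repetitive 5' flank
UNIT = "GGAGCAGGTCAA"     # hypothetical 12-bp repeat unit
RIGHT = "TTCGATAACG"      # hypothetical non-repetitive 3' flank

def make_gene(copies):
    """Build a toy gene with the given number of tandem repeat units."""
    return LEFT + UNIT * copies + RIGHT

K = 8  # assumed k-mer size, shorter than the repeat unit

edges_two = de_bruijn_edges(make_gene(2), K)
edges_five = de_bruijn_edges(make_gene(5), K)

# The two graphs have identical edges, so the graph alone cannot tell
# a gene with two repeat units from one with five.
same_graph = edges_two == edges_five
print("identical de Bruijn graphs:", same_graph)

# An error-free read spanning the whole region keeps the copy number.
spanning_read = make_gene(5)
print("repeat units in spanning read:", spanning_read.count(UNIT))
```

In this sketch the spanning read is error-free, which real long reads are not; that gap is the second difficulty the abstract describes, and it is why the strategy pairs accurately assembled repeat units with long reads used only for scaffolding.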
Copyright information
© 2023 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Kono, N. (2023). A Guide to Sequencing for Long Repetitive Regions. In: Arakawa, K. (eds) Nanopore Sequencing. Methods in Molecular Biology, vol 2632. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2996-3_10
DOI: https://doi.org/10.1007/978-1-0716-2996-3_10
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-2995-6
Online ISBN: 978-1-0716-2996-3
eBook Packages: Springer Protocols