Introduction to Isoform Sequencing Using Pacific Biosciences Technology (Iso-Seq)

Part of the Translational Bioinformatics book series (TRBIO, volume 9)

Abstract

Alternative RNA splicing is a well-known phenomenon, but we still lack a complete catalog of the isoforms that explain variability in the human transcriptome. Methods for studying transcriptome variability have advanced considerably, yet we remain far from a complete picture. The earliest methods for studying gene expression relied on cDNA cloning and Sanger sequencing, a strategy that was labor-intensive and expensive. With the development of microarrays, methods based on exon arrays and tiling arrays provided valuable information about RNA expression. However, microarrays had significant limitations. Most of these limitations were apparent by 2005, but it was not until 2008 that an alternative method to study the transcriptome was developed. RNA sequencing using next-generation sequencing (RNA-Seq) quickly became the technology of choice for gene expression profiling. Recently, the precision and sensitivity of RNA-Seq have come into question, especially for transcriptome reconstruction. This chapter describes a relatively new method, “Isoform Sequencing” (Iso-Seq). Iso-Seq was developed by Pacific Biosciences (PacBio), and its long-read technology allows it to identify new isoforms with extraordinary precision. Library preparation is straightforward, and the PacBio RS II instrument generates data within hours. The bioinformatics analysis is performed with the freely available SMRT® Portal software, which is easy to use and performs all the steps needed to process the raw data into high-quality full-length isoforms. For Iso-Seq to gain universal acceptance, the capacity of SMRT® Cells needs to improve at least 10- to 100-fold to make the system affordable and attractive to users.

Keywords

  • Isoform
  • Pacific Biosciences
  • Iso-Seq
  • PacBio
  • SMRT
  • RNA-Seq

Author information

Corresponding author

Correspondence to Manuel L. Gonzalez-Garay.

Copyright information

© 2016 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Gonzalez-Garay, M.L. (2016). Introduction to Isoform Sequencing Using Pacific Biosciences Technology (Iso-Seq). In: Wu, J. (eds) Transcriptomics and Gene Regulation. Translational Bioinformatics, vol 9. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-7450-5_6
