Sequencing and Assembling the Nuclear and Organelle Genomes of North American Spruces

  • Inanc Birol
  • Amanda R. De la Torre
Part of the Compendium of Plant Genomes book series (CPG)


Reference genomes provide valuable information for studying the molecular biology and genomic architecture of species, and they constitute a baseline for applied sciences such as molecular breeding and gene editing. The sequencing of conifer genomes still lags behind that of other plant and animal species, with only a few conifers having fully sequenced genomes to date. This chapter describes the sequencing and bioinformatics analysis of the nuclear and organelle genome assemblies of the economically important white spruce (Picea glauca) and the closely related Picea species P. sitchensis and P. engelmannii. The chapter closes with perspectives on future genome assemblies of North American species.


Genome sequencing; Organelle genomes; Picea glauca; Picea sitchensis; Picea engelmannii



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Department of Medical Genetics, Michael Smith Genome Sciences Centre, University of British Columbia, Vancouver, Canada
  2. School of Forestry, Northern Arizona University, Flagstaff, USA
