Molecular Breeding, 36:115

Development of single nucleotide polymorphism markers in the large and complex rubber tree genome using next-generation sequence data

  • Livia Moura de Souza
  • Guilherme Toledo-Silva
  • Claudio Benicio Cardoso-Silva
  • Carla Cristina da Silva
  • Isabela Aparecida de Araujo Andreotti
  • Andre Ricardo Oliveira Conson
  • Camila Campos Mantello
  • Vincent Le Guen
  • Anete Pereira de Souza (corresponding author)


The development of single nucleotide polymorphism (SNP) markers provides the opportunity to improve many areas of plant breeding and population genetics. Unfortunately, for species such as the rubber tree (Hevea brasiliensis), the use of next-generation sequencing for genomic SNP discovery is very difficult because of the large genome size and the abundance of repeated sequences. Access to a set of validated SNP markers is a significant advantage for rubber researchers who wish to apply SNPs in scientific research. Here, we performed genomic sequencing of H. brasiliensis and generated 10,993,648 short reads, which were assembled into 10,071 contigs (N50 = 3078) by a de novo assembly strategy. A total of 2446 contigs presented no hits in the current H. brasiliensis genome assembly and may therefore be considered novel genomic sequences of rubber tree. A total of 143 putative polymorphic positions were selected, gene annotations were available for 58.7 % of the markers, and all of the sequences could be anchored to the released H. brasiliensis genome. These SNPs were validated in eight genotypes of H. brasiliensis and 15 F1 plants from a mapping population, resulting in 30 (20.9 %) positions correctly classified. The analysis revealed key candidate genes responsible for defence mechanisms and provided markers for further genetic improvement of Hevea in breeding programmes.
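The abstract reports the assembly's N50 without defining it. N50 is the contig length at which contigs of that length or longer together hold at least half of the assembled bases. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `n50` and the toy contig lengths are illustrative assumptions and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L
    account for at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down, accumulating bases
    # until half of the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example (hypothetical lengths, not from the paper).
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_contigs))
```

A larger N50 indicates that more of the assembly sits in long contigs, which is why it is commonly quoted next to the contig count.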


Keywords: Single nucleotide polymorphism; Next-generation sequencing; Molecular marker; Hevea brasiliensis



The authors gratefully acknowledge the Fundação de Amparo à Pesquisa do Estado de São Paulo, the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES, Computational Biology Program and Agropolis Program) and the Conselho Nacional de Desenvolvimento Científico e Tecnológico for financial support, scholarships and a research fellowship.

Authors’ contributions

LMS performed the molecular genetic studies, helped to perform the biocomputational analysis and drafted the manuscript. GTS, CBCS and CCS performed a biocomputational analysis and drafted the manuscript. GTS, ARC, CCM and IAAA assisted in the molecular genetics studies. VLG participated in the evaluations of the molecular data and helped to draft the manuscript. APS conceived the study, participated in its design and coordination and helped to draft the manuscript. All of the authors read and approved the final manuscript.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Supplementary material

Supplementary material 1: 11032_2016_534_MOESM1_ESM.bz2 (BZ2, 2292 kb)
Supplementary material 2: 11032_2016_534_MOESM2_ESM.xlsx (XLSX, 282 kb)



Copyright information

© Springer Science+Business Media Dordrecht 2016

Authors and Affiliations

  • Livia Moura de Souza (1)
  • Guilherme Toledo-Silva (1, 2)
  • Claudio Benicio Cardoso-Silva (1)
  • Carla Cristina da Silva (1)
  • Isabela Aparecida de Araujo Andreotti (1)
  • Andre Ricardo Oliveira Conson (1)
  • Camila Campos Mantello (1)
  • Vincent Le Guen (3)
  • Anete Pereira de Souza (1, 4; corresponding author)

  1. Molecular Biology and Genetic Engineering Center (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
  2. Department of Biochemistry, Federal University of Santa Catarina, Florianópolis, Brazil
  3. UMR AGAP, Centre de Coopération Internationale en Recherche Agronomique pour le Développement (CIRAD), Montpellier, France
  4. Department of Plant Biology, Biology Institute, University of Campinas (UNICAMP), Campinas, Brazil
