
Development of SNP and InDel markers via de novo transcriptome assembly in Sesamum indicum L.


Cultivated sesame (Sesamum indicum L.) is an important oil crop because of its high oil content and quality. To discover single nucleotide polymorphisms (SNPs) and insertions/deletions (InDels) from RNA-Seq data, we collected a total of 33.47 Gbp of data from three sesame transcriptome datasets. A reference transcriptome comprising 267,508 unigenes was constructed. Among the 37,646 transcripts with complete open reading frames, a total of 7,450 SNPs and 362 InDels were found, with frequencies of one SNP per 6.66 kb and one InDel per 137 kb, respectively. Most of the SNPs were transitions (C–T or A–G). A total of 21 InDel types with lengths ranging from 1 to 38 bp were identified; short InDels (1–2 bp) were the most abundant, accounting for over 80% of the total. Furthermore, 4,959 (66.56%) SNPs were detected in protein-coding regions: 2,899 (58.46%) were synonymous and 2,060 (41.54%) were nonsynonymous. All SNPs and InDels detected in this study were bi-allelic. Of 40 SNPs and 40 InDels selected at random, 92.5% of the SNPs and 95.0% of the InDels exhibited polymorphism according to PCR-based assays and Sanger sequencing. The efficiency of the newly developed polymorphic SNP and InDel markers was then evaluated among 36 commercial sesame cultivars. More than 90.0% of the markers displayed the expected polymorphic amplifications. The polymorphism information content values ranged from 0.05 to 0.58, with an average of 0.38. Moreover, all 36 commercial cultivars tested were definitively distinguished by 21 SNPs and 16 InDels. These newly identified molecular markers may provide a foundation for cultivar identification, genetic diversity analysis, qualitative and quantitative trait mapping and marker-assisted selection breeding in sesame.
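The marker frequencies and percentages quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' pipeline. It assumes that the frequencies are computed over the combined coding and non-coding length of the 37,646 ORF-containing transcripts (34.67 Mbp and 14.94 Mbp, as listed for Additional file 5); that denominator is an assumption, although it does reproduce the reported values. The function names `kb_per_marker`, `percent` and `is_transition` are hypothetical helpers introduced here.

```python
# Figures taken from the abstract and from the description of Additional file 5.
CODING_MBP = 34.67       # total coding length of the ORF transcripts
NONCODING_MBP = 14.94    # total non-coding length of the ORF transcripts
N_SNPS = 7450
N_INDELS = 362
CODING_SNPS = 4959
SYNONYMOUS = 2899
NONSYNONYMOUS = 2060


def kb_per_marker(total_mbp, n_markers):
    """Average sequence length in kb per marker (assumed denominator: total_mbp)."""
    return total_mbp * 1000.0 / n_markers


def percent(part, whole):
    """Share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole


# Transitions are purine<->purine (A/G) or pyrimidine<->pyrimidine (C/T) changes.
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def is_transition(ref, alt):
    """True if the substitution ref->alt is a transition, False otherwise."""
    return frozenset((ref.upper(), alt.upper())) in TRANSITIONS


total_mbp = CODING_MBP + NONCODING_MBP           # assumed denominator
snp_spacing = kb_per_marker(total_mbp, N_SNPS)
indel_spacing = kb_per_marker(total_mbp, N_INDELS)

print(f"one SNP per {snp_spacing:.2f} kb")
print(f"one InDel per {indel_spacing:.0f} kb")
print(f"coding SNPs: {percent(CODING_SNPS, N_SNPS):.2f}%")
print(f"synonymous: {percent(SYNONYMOUS, CODING_SNPS):.2f}%")
print(f"nonsynonymous: {percent(NONSYNONYMOUS, CODING_SNPS):.2f}%")
```

Under that assumption the computed spacings and percentages agree with the figures in the abstract, and `is_transition` classifies a C–T or A–G substitution as a transition and any other base change as a transversion.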





This work was supported by the National ‘973’ Project (Grant No. 2011CB109304), the earmarked fund for China Agriculture Research System (Grant No. CARS-15) and the National Natural Science Foundation of China (Grant No. U1204318).

Author information

Correspondence to Haiyang Zhang.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOC 56 kb)

Supplementary material 2 (DOC 46 kb)

Supplementary material 3 (DOC 44 kb)

Additional file 4 Comparison of the three sesame transcriptomes. Contigs from each of the three cultivars were screened against the A. thaliana proteome. The non-redundant sequences of the three accessions matched to the A. thaliana proteome are shown. (JPEG 62 kb)

Additional file 5 Characteristics of the reference transcriptome. There are 229,862 non-ORF sequences and 37,646 ORF sequences in the reference transcriptome. Among the 37,646 ORF sequences, the total lengths of coding regions and non-coding regions reach 34.67 Mbp and 14.94 Mbp, respectively. (TIFF 522 kb)

Additional file 6 Information on all SNPs among the three RNA-Seq materials. SNP positions and types are shown. (XLS 2,504 kb)

Additional file 7 Information on all InDels among the three RNA-Seq materials. InDel positions and types are shown. (XLS 106 kb)

Supplementary material 8 (DOC 37 kb)

Supplementary material 9 (XLS 79 kb)

Supplementary material 10 (XLS 26 kb)

Supplementary material 11 (XLS 53 kb)

Supplementary material 12 (XLS 24 kb)

Supplementary material 13 (XLS 56 kb)

Supplementary material 14 (XLS 72 kb)

Supplementary material 15 (XLS 26 kb)

Supplementary material 16 (XLS 78 kb)

Additional file 17 DNA fingerprint profiles of the 36 sesame cultivars using 16 SNP markers. Sesame cultivars M1–M36 were differentiated using a panel of 16 SNP markers. (DOC 78 kb)

Supplementary material 18 (DOC 77 kb)

About this article

Cite this article

Wei, L., Miao, H., Li, C. et al. Development of SNP and InDel markers via de novo transcriptome assembly in Sesamum indicum L. Mol Breeding 34, 2205–2217 (2014).


  • Sesame
  • Single-nucleotide polymorphisms (SNPs)
  • Insertions/deletions (InDels)
  • Cultivar identification