Abstract
Cultivated sesame (Sesamum indicum L.) is an important oil crop because of its high oil content and quality. To discover single nucleotide polymorphisms (SNPs) and insertions/deletions (InDels) from RNA-Seq data, we collected a total of 33.47 Gbp of data from three sesame transcriptome datasets. A reference transcriptome covering 267,508 unigenes was constructed. Among the 37,646 transcripts with complete open reading frames, a total of 7,450 SNPs and 362 InDels were found, with frequencies of one SNP per 6.66 kb and one InDel per 137 kb, respectively. Most of the SNPs were transitions (C–T or A–G). A total of 21 InDel types with lengths ranging from 1 to 38 bp were identified, and short InDels (1–2 bp) were the most abundant, accounting for over 80 %. Furthermore, 4,959 (66.56 %) SNPs were detected in protein-coding regions: 2,899 (58.46 %) were synonymous and 2,060 (41.54 %) were nonsynonymous. All SNPs and InDels detected in this study were bi-allelic. Of 40 SNPs and 40 InDels selected at random, 92.5 % of the SNPs and 95.0 % of the InDels were confirmed as polymorphic by PCR-based assays and Sanger sequencing. In addition, the efficiency of the newly developed polymorphic SNP and InDel markers was evaluated among 36 commercial sesame cultivars. More than 90.0 % of the markers displayed the expected polymorphic amplifications. The polymorphism information content values ranged from 0.05 to 0.58 with an average of 0.38. Moreover, all genotypes of the 36 commercial cultivars tested were definitively distinguished by 21 SNPs and 16 InDels. These newly identified molecular markers may provide a foundation for cultivar identification, genetic diversity analysis, qualitative and quantitative trait mapping and marker-assisted selection breeding in sesame.
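The marker frequencies and percentages quoted in the abstract can be reproduced with simple arithmetic. The sketch below is illustrative only and is not the authors' pipeline. It assumes that the frequencies were computed over the combined coding and non-coding length of the 37,646 ORF-containing transcripts (34.67 Mbp and 14.94 Mbp, as given in the description of Additional file 5). The variable and function names are hypothetical.

```python
# Counts reported in the abstract.
snps_total = 7450
indels_total = 362
snps_coding = 4959
snps_synonymous = 2899
snps_nonsynonymous = 2060

# Lengths (bp) reported for the ORF-containing transcripts (Additional file 5).
# Assumption: the frequencies refer to the sum of both regions.
coding_bp = 34_670_000
noncoding_bp = 14_940_000
total_bp = coding_bp + noncoding_bp


def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)


def kb_per_variant(length_bp, n_variants):
    """Average spacing between variants, in kilobases."""
    return length_bp / n_variants / 1000


print(f"one SNP per {kb_per_variant(total_bp, snps_total):.2f} kb")
print(f"one InDel per {kb_per_variant(total_bp, indels_total):.0f} kb")
print(f"coding SNPs: {pct(snps_coding, snps_total)} %")
print(f"synonymous: {pct(snps_synonymous, snps_coding)} %")
print(f"nonsynonymous: {pct(snps_nonsynonymous, snps_coding)} %")
```

Under this assumption the computed values agree with the reported figures (6.66 kb, 137 kb, 66.56 %, 58.46 % and 41.54 %).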

Acknowledgments
This work was supported by the National ‘973’ Project (Grant No. 2011CB109304), the earmarked fund for China Agriculture Research System (Grant No. CARS-15) and the National Natural Science Foundation of China (Grant No. U1204318).
Electronic supplementary material
Below is the link to the electronic supplementary material.
11032_2014_174_MOESM4_ESM.jpg
Additional file 4 Comparisons of three groups of sesame transcriptomes. Contigs from each of the three cultivars were screened against the A. thaliana proteome. The non-redundant sequences of three accessions to the A. thaliana proteome are shown. (JPEG 62 kb)
11032_2014_174_MOESM5_ESM.tif
Additional file 5 Characteristics of the reference transcriptome. There are 229,862 non-ORF sequences and 37,646 ORF sequences in the reference transcriptome. Among the 37,646 ORF sequences, the total lengths of coding regions and non-coding regions reach 34.67 Mbp and 14.94 Mbp, respectively. (TIFF 522 kb)
About this article
Cite this article
Wei, L., Miao, H., Li, C. et al. Development of SNP and InDel markers via de novo transcriptome assembly in Sesamum indicum L. Mol Breeding 34, 2205–2217 (2014). https://doi.org/10.1007/s11032-014-0174-4
DOI: https://doi.org/10.1007/s11032-014-0174-4
Keywords
- Sesame
- Single-nucleotide polymorphisms (SNPs)
- Insertions/deletions (InDels)
- Cultivar identification