Abstract
Molecular markers are important tools for genotyping in genetic studies and molecular breeding. Simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs) are two commonly used marker systems developed from genomic or transcript sequences. The objectives of this study were to: (1) assemble and annotate the publicly available ESTs in Arachis and the in-house short reads, (2) develop and validate SSR and SNP markers, and (3) investigate the genetic diversity and population structure of the peanut breeding lines and the U.S. peanut mini core collection using the developed SSR markers. An NCBI EST dataset with 252,951 sequences and an in-house 454 RNAseq dataset with 288,701 sequences were assembled separately after trimming. Transcript sequence comparison and phylogenetic analysis suggested that peanut is more closely related to cowpea and scarlet bean than to soybean, common bean and Medicago. From these two datasets, 6,455 novel SSRs and 11,902 SNPs were identified. Of the discovered SSRs, 380 representing various SSR types were selected for PCR validation. The amplification rate was 89.2 %. Twenty-two SSRs (6.5 % of those that amplified) were polymorphic between at least one pair of four genotypes. Sanger sequencing of PCR products targeting 110 SNPs revealed 13 true SNPs between tetraploid genotypes and 193 homoeologous SNPs within genotypes. Eight of the 22 polymorphic SSR markers were selected to evaluate the genetic diversity of Florida peanut breeding lines and the U.S. peanut mini core collection. This marker set demonstrated high discrimination power, with an average polymorphism information content value of 0.783, a combined probability of identity of 10⁻¹¹, and a combined power of exclusion of 0.99991. The structure analysis revealed four sub-populations among the peanut accessions and lines evaluated.
The results of this study enriched the peanut genomic resources, provided over 6000 novel SSR markers and the credentials for true peanut SNP marker development, and demonstrated the power of newly developed SSR markers in genotyping peanut germplasm and breeding materials.
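The discrimination statistics quoted in the abstract can be illustrated with a small calculation. The sketch below assumes the commonly used definitions of polymorphism information content (PIC) and single-locus probability of identity (PI) for a co-dominant marker; the abstract does not state which formulas or software settings were used, so this is an illustration and not the authors' pipeline. The function names and the example allele frequencies are hypothetical.

```python
from itertools import combinations
from math import isclose, prod


def _check(freqs):
    # Allele frequencies at one locus must be non-negative and sum to 1.
    if any(p < 0 for p in freqs) or not isclose(sum(freqs), 1.0, abs_tol=1e-9):
        raise ValueError("allele frequencies must be >= 0 and sum to 1")


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2 (assumed definition)."""
    _check(freqs)
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


def prob_identity(freqs):
    """PI = sum(p_i^4) + sum_{i<j} (2 * p_i * p_j)^2 (assumed definition)."""
    _check(freqs)
    homozygotes = sum(p ** 4 for p in freqs)
    heterozygotes = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygotes + heterozygotes


def combined_prob_identity(loci):
    """Multiply single-locus PI values, assuming the loci are independent."""
    return prod(prob_identity(freqs) for freqs in loci)


if __name__ == "__main__":
    # Hypothetical locus with four equally frequent alleles.
    locus = [0.25, 0.25, 0.25, 0.25]
    print(pic(locus))
    print(prob_identity(locus))
    # Eight such loci, mirroring the size of the marker set in the abstract.
    print(combined_prob_identity([locus] * 8))
```

Because the combined PI is a product over loci, it shrinks quickly as markers are added, which is why a set of only eight highly polymorphic SSRs can reach a very small combined value. The product form relies on the loci being unlinked, which is an assumption here.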
Acknowledgments
The research presented in this article was sponsored by the Florida Peanut Producers Association.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by S. Hohmann.
Cite this article
Peng, Z., Gallo, M., Tillman, B.L. et al. Molecular marker development from transcript sequences and germplasm evaluation for cultivated peanut (Arachis hypogaea L.). Mol Genet Genomics 291, 363–381 (2016). https://doi.org/10.1007/s00438-015-1115-6
DOI: https://doi.org/10.1007/s00438-015-1115-6