Discovery of single nucleotide polymorphisms in Lycopersicon esculentum by computer aided analysis of expressed sequence tags
- Cite this article as:
- Yang, W., Bai, X., Kabelka, E. et al. Molecular Breeding (2004) 14: 21. doi:10.1023/B:MOLB.0000037992.03731.a5
Single nucleotide polymorphisms (SNPs) are useful for characterizing allelic variation, for genome-wide mapping, and as a tool for marker-assisted selection. Discovery of SNPs through de novo sequencing is inefficient within cultivated tomato (Lycopersicon esculentum Mill.) because the polymorphism rate is more than ten-fold lower than the sequencing error rate. The availability of expressed sequence tag (EST) data has made it feasible to discover putative SNPs “in silico” prior to experimental verification. By exploiting the redundancy among 148,373 tomato ESTs available for different varieties, we have identified candidate SNPs for use within cultivated germplasm pools. A total of 1,245 contigs, each having three EST sequences of Rio Grande and three EST sequences of TA496, were used for SNP discovery. We detected 1 SNP for every 8,500 bases analyzed, with 101 candidate SNPs in 44 genes identified. Sixty-six SNPs could be recognized by restriction enzymes, and subsequent experimental verification using restriction digestion or CEL I digestion confirmed 83% of the putative polymorphisms tested. SNPs between TA496 and Rio Grande have a high probability (53%) of detecting polymorphisms between other L. esculentum varieties. Twenty-six SNPs in 18 unigenes were mapped to specific chromosomes. Two SNPs, LEOH23 and LEOH37, were shown to be linked to quantitative trait loci contributing to fruit color within elite breeding populations. These results suggest that the growing databases of DNA sequence will yield information that facilitates improvement within the germplasm pools that have contributed to productive modern varieties.
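The redundancy-based screen described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `candidate_snps`, the input layout (pre-aligned reads of equal length grouped by variety), and the rule that all reads of a variety must agree at a position are assumptions made for illustration. The only detail taken from the abstract is the use of three ESTs per variety, which appears here as the default `min_reads=3`.

```python
def candidate_snps(contig, min_reads=3):
    """Return candidate SNP positions in one aligned contig.

    contig: dict mapping a variety name to a list of aligned reads
            (equal-length strings; '-' or 'N' marks no usable base).
    A position is reported only if every variety has at least
    `min_reads` usable bases there, all reads within a variety agree
    (assumed filter against sequencing errors), and the varieties
    differ from one another.
    """
    length = len(next(iter(contig.values()))[0])
    snps = []
    for pos in range(length):
        alleles = {}
        for variety, reads in contig.items():
            bases = [read[pos] for read in reads if read[pos] in "ACGT"]
            # Skip the position if coverage is too low or reads disagree.
            if len(bases) < min_reads or len(set(bases)) != 1:
                break
            alleles[variety] = bases[0]
        else:
            # Reached only when every variety passed the filter above.
            if len(set(alleles.values())) > 1:
                snps.append((pos, alleles))
    return snps


# Toy contig (invented data): position 2 differs consistently between
# varieties; position 4 has a single discordant read and is ignored.
example = {
    "Rio Grande": ["ACGTAC", "ACGTAC", "ACGTAC"],
    "TA496": ["ACATAC", "ACATAC", "ACATTC"],
}
print(candidate_snps(example))
```

Requiring agreement among redundant reads within each variety is one simple way to keep the false-positive rate down when sequencing errors are far more common than true polymorphisms, which is the situation the abstract describes for cultivated tomato. Candidates found this way would still need experimental verification.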