Molecular Breeding, Volume 30, Issue 1, pp 419–428

Single nucleotide polymorphism discovery in common bean

  • Thiago Lívio P. O. Souza
  • Everaldo G. de Barros
  • Claudia M. Bellato
  • Eun-Young Hwang
  • Perry B. Cregan
  • Marcial A. Pastor-Corrales

Abstract

Single nucleotide polymorphisms (SNPs) were discovered in common bean (Phaseolus vulgaris L.) via resequencing of sequence-tagged sites (STSs) developed by PCR primers previously designed to soybean shotgun and bacterial artificial chromosome (BAC) end sequences, and by primers designed to common bean genes and microsatellite flanking regions. DNA fragments harboring SNPs were identified in single amplicons from six contrasting P. vulgaris genotypes of the Andean (Jalo EEP 558, G 19833, and AND 277) and Mesoamerican (BAT 93, DOR 364, and Rudá) gene pools. These genotypes are the parents of three common bean recombinant inbred line mapping populations. From an initial set of 1,880 PCR primer pairs tested, 265 robust STSs were obtained, which could be sequenced in each one of the six common bean genotypes. In the resulting 131,120 bp of aligned sequence, a total of 677 SNPs were identified, including 555 single-base changes (295 transitions and 260 transversions) and 122 small nucleotide insertions/deletions (indels). The frequency of SNPs was 5.16 SNPs/kb and the mean nucleotide diversity, expressed as Halushka’s theta, was 0.00226. This work represents one of the first efforts aimed at detecting SNPs in P. vulgaris. The SNPs identified should be an important resource for common bean geneticists and breeders for quantitative trait locus discovery, marker-assisted selection, and map-based cloning. These SNPs will also be useful for diversity analysis and microsynteny studies among legume species.

Keywords

DNA polymorphisms; Genome variations; Molecular markers; PCR primers; Phaseolus vulgaris; Sequence-tagged sites

Introduction

Common bean (Phaseolus vulgaris L.), which includes dry and snap beans, is the most important legume for direct human consumption. Dry and snap beans are grown and consumed worldwide but are particularly important in the Americas, especially Latin America, as well as in Africa and Asia. Dry beans are largely a subsistence crop used as a major source of dietary protein in these countries, as a complement to carbohydrate-rich sources such as rice, maize, and cassava. The common bean is also an important source of minerals, i.e., iron and zinc, and of certain vitamins. For this reason, dry beans in particular are becoming increasingly important as an ingredient of a healthy diet for the prevention and treatment of degenerative diseases such as high cholesterol, diabetes, and obesity (Wortmann et al. 1998; Broughton et al. 2003).

Effective molecular tools for genetic studies in the common bean are greatly needed. Single DNA base differences between homologous DNA fragments and small nucleotide insertions and deletions (indels), collectively referred to as single nucleotide polymorphisms (SNPs), are highly desirable as molecular markers because they can be used for in-depth genetic analysis even among closely related genotypes.

SNPs can be used as biallelic and codominant DNA markers for a variety of tasks in crop improvement including gene and quantitative trait locus (QTL) discovery, assessment of genetic diversity, association analysis, and marker-assisted selection. SNP markers have two main advantages over other molecular markers: (1) they are the most abundant form of genetic variation within genomes (Zhu et al. 2003), and (2) a wide array of technologies has now been developed for high-throughput SNP analysis (Fan et al. 2006). In addition, SNPs are more stable than simple sequence repeat (SSR) markers because of their lower mutation rate. For this reason, SNPs are useful for population genetics and phylogeny studies (Jin et al. 2003). Another relevant characteristic of SNP markers is that they can be transferred between closely related species and utilized for microsynteny analysis.

Large numbers of SNP markers are currently available for many plant species including Arabidopsis thaliana (Jander et al. 2002; Schmid et al. 2003, 2005; Nordborg et al. 2005), maize (Zea mays L.) (Tenaillon et al. 2001; Ching et al. 2002; Wright et al. 2005), rice (Oryza sativa L.) (Feltus et al. 2004), barley (Hordeum vulgare L.) (Kanazin et al. 2002; Rostoks et al. 2005), sorghum (Sorghum bicolor L.) (Hamblin et al. 2004), soybean (Glycine max L.) (Zhu et al. 2003; Choi et al. 2007), sugarbeet (Beta vulgaris L.) (Schneider et al. 2001), poplar (Populus trichocarpa Torr. & Gray) (Tuskan et al. 2006), apple (Malus domestica Borkh.) (Chagné et al. 2008), and Vitis spp. (Fernández et al. 2008). SNPs have also been detected in common bean. Gaitán-Solís et al. (2008) identified SNPs in P. vulgaris by comparing sequences from coding and non-coding regions obtained from GenBank and genomic DNA. A sequence analysis was conducted of 10 genotypes of cultivated and wild beans belonging to the Mesoamerican and Andean gene pools. For the 10 genotypes evaluated, a total of 20,964 bp were analyzed in each genotype and compared, resulting in the discovery of 372 SNPs in 41 distinct sequence-tagged sites (STSs).

More than 20,000 SNPs have been discovered in soybean by sequence analysis and comparison of STSs from a small set of diverse genotypes. An initial phase of a large-scale soybean SNP discovery effort resulted in the identification of 5,551 SNPs (Zhu et al. 2003; Choi et al. 2007). Since then, many thousands of soybean SNPs have been identified using the same approach (Hyten et al. 2009). Because of the relatively close relationship between soybean and common bean (Zhu et al. 2005), it has been proposed that many of the soybean-derived STS primers would be useful for amplifying single amplicons in the common bean genome.

In this work, we used available PCR primers designed to amplify soybean genes and bacterial artificial chromosome (BAC) end sequences for SNP discovery in common bean. This approach aimed to save time and money in the initial steps of primer development. In addition, we also used available PCR primers designed to common bean microsatellite flanking regions. A higher level of sequence polymorphism is present in non-coding DNA than in coding DNA sequences. For this reason, microsatellite regions could be a good source of SNPs assuming that microsatellites are genomic rather than genic (Zhu et al. 2003). Finally, we also used PCR primers designed to specific P. vulgaris genes.

Materials and methods

Plant material and DNA extraction

Six contrasting Andean (Jalo EEP 558, G 19833, and AND 277) and Mesoamerican (BAT 93, DOR 364, and Rudá) common bean genotypes were used for SNP discovery. These genotypes are the parents of three common bean RIL mapping populations: BAT 93 × Jalo EEP 558, DOR 364 × G 19833, and Rudá × AND 277 (Freyre et al. 1998; Geffroy et al. 1998; Kelly et al. 2003; Miklas et al. 2006). Seeds from Rudá and AND 277 were provided by the P. vulgaris Active Germplasm Bank of the BIOAGRO/UFV (Viçosa, MG, Brazil). Seeds from the other common bean lines were provided by the Active Germplasm Bank of the Beltsville Agricultural Research Center, USDA/ARS (Beltsville, MD, USA).

Genomic DNA was extracted from bulked leaf tissue of five plants of each common bean genotype using the DNeasy Plant Mini DNA extraction kit (Qiagen, Valencia, CA, USA), according to the manufacturer’s protocol.

Sources of PCR primers for STS amplification

Sequences of the soybean PCR primers were selected as described by Choi et al. (2007). Primer sequences can be obtained from the Beltsville Agricultural Research Center Soybean Map and SNP Database, at http://bfgl.anri.barc.usda.gov/soybean. Primer sets were randomly selected from among STS primers designed to soybean genes and BAC end sequences.

Common bean PCR primer sets designed to microsatellite flanking regions were obtained from published data available as of August 2008 on the Bean Improvement Cooperative website (http://www.css.msu.edu/bic/Genetics.cfm). With the aim of increasing the number of available markers, SSR primers were selected that had not been mapped on the common bean core mapping population (BAT 93 × Jalo EEP 558) as they did not show size polymorphisms between the parent cultivars.

PCR primers were also designed to amplify common bean gene sequences associated with important agronomic traits. Full-length gene sequences randomly selected from GenBank (http://www.ncbi.nlm.nih.gov) were also used as templates to design STS primers. Primers were designed using the software Primer3 (Whitehead Inst., Cambridge, MA, USA) and Array Designer 2 (Premier Biosoft International, Palo Alto, CA, USA) with predicted PCR product lengths from 400 to 800 bp.

Preliminary test of PCR primers and DNA sequence analysis

All primer pairs were initially tested via PCR and agarose gel analysis aimed at identifying those pairs producing single amplicons. Amplification reactions were performed with 30 ng of genomic DNA from cultivar Jalo EEP 558, 1.0 μM of forward and reverse primers, 1.0 μl of 2× BioLine B PCR buffer (BioLine USA Inc., Taunton, MA, USA), 1.5 mM MgCl2, and 2 U of Taq DNA polymerase in a 10 μl reaction volume. DNA amplification assays consisted of: one initial denaturation step at 94°C for 5 min; 30 cycles at 94°C for 45 s, 58°C (soybean primers) or 54°C (common bean primers) for 1 min, and 72°C for 1 min; and a final step at 72°C for 7 min. The amplified products were visualized under UV light after electrophoresis on 1.5% agarose gels containing ethidium bromide (0.2 μg/ml), and immersed in a 1× sodium boric acid (SB) buffer (10 mM sodium hydroxide, pH adjusted to 8.5 with boric acid). Reactions with soybean primers that gave no products were reamplified using a 48°C annealing temperature. The primer sets that amplified a single discrete product were selected for DNA sequence analysis.

The single amplicons from selected PCR primer pairs were prepared for sequence analysis by treatment with 4 U of shrimp alkaline phosphatase (SAP) and 4 U of exonuclease I (ExoI) incubated at 37°C for 1 h followed by 72°C for 15 min to inactivate the enzymes. Labeling reactions were performed with 1 μl of PCR product, 1.0 μl (4 U) of BigDye Terminator version 3.1 (Applied Biosystems, Foster City, CA, USA), 1.5 μM of one of the original PCR primers (forward and reverse primers in separate reactions), 0.8 μl of 10× Promega Taq DNA polymerase buffer (Promega, Madison, WI, USA), and 1.75 mM of MgCl2 in a 10 μl reaction volume. Labeling reaction cycling conditions were as follows: one initial denaturation step at 90°C for 30 s; 40 cycles at 90°C for 10 s, 50°C for 5 s, and 60°C for 1 min. The PCR products were sequenced from both ends and the resulting termination products were analyzed on an ABI 3730 DNA Analyzer. The two resulting sequence traces derived from opposite ends of each amplicon were analyzed and aligned with the standard DNA analysis software Phred (Ewing and Green 1998) and Phrap (http://www.phrap.org/). Resulting alignments and trace data were visually inspected in the Consed viewer (Gordon et al. 1998) to distinguish between those amplicons that were locus-specific and those that apparently resulted from amplification of two or more loci. The primer pairs that produced single amplicons with good-quality sequence data were used for PCR amplification and sequencing of genomic DNA of the other five common bean genotypes: BAT 93, DOR 364, G 19833, Rudá, and AND 277. Resulting amplicons were treated with SAP/ExoI, sequenced, and analyzed as described for Jalo EEP 558. Forward and reverse sequence traces from the six genotypes were aligned.
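As background to the trace analysis above, Phred reports each base call with a quality score Q that is related to the estimated error probability P by Q = −10 log10(P) (Ewing and Green 1998). The following Python sketch shows that conversion only. The function names and the score values printed are illustrative assumptions; the text does not state which quality cutoff was applied.

```python
import math


def phred_to_error_prob(q):
    """Estimated base-calling error probability for a Phred quality score q."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred quality score corresponding to an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


# Illustrative scores only; no cutoff is reported in the text.
for q in (10, 20, 30, 40):
    print(f"Q{q}: estimated error probability {phred_to_error_prob(q):.4f}")
```

A score of 20 therefore corresponds to roughly one expected miscall per 100 bases, and a score of 30 to one per 1,000.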

STS verification and SNP discovery

SNP discovery was carried out by sequence alignment using the SNP-PHAGE software (Matukumalli et al. 2006). Sequence traces for each putative SNP identified were visually inspected to confirm the sequence polymorphisms. Single-DNA base changes and indels present in each alignment were recorded as described by Matukumalli et al. (2006).
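The idea of recording single-base changes and indels from an alignment of the six genotypes can be illustrated with a minimal Python sketch. This is not SNP-PHAGE and does not reproduce its trace-based logic; the function name `find_variable_sites` and the toy sequences are hypothetical, and every gap column is reported separately, which is a simplification for multi-base indels.

```python
def find_variable_sites(aligned):
    """Return (position, kind, alleles) for every polymorphic alignment column.

    aligned: dict mapping genotype name -> aligned sequence of equal length,
    with '-' marking alignment gaps. Positions are 1-based.
    """
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must be aligned to the same length")
    sites = []
    for pos in range(lengths.pop()):
        alleles = {name: seq[pos].upper() for name, seq in aligned.items()}
        observed = set(alleles.values())
        if len(observed) > 1:
            kind = "indel" if "-" in observed else "single-base change"
            sites.append((pos + 1, kind, alleles))
    return sites


# Toy alignment (invented sequences, for illustration only).
toy_alignment = {
    "Jalo EEP 558": "ACGTTAGC-A",
    "G 19833":      "ACGTTAGC-A",
    "AND 277":      "ACGTTAGC-A",
    "BAT 93":       "ACATTAGCTA",
    "DOR 364":      "ACATTAGCTA",
    "Rudá":         "ACATTCGCTA",
}

sites = find_variable_sites(toy_alignment)
for position, kind, alleles in sites:
    print(position, kind, sorted(set(alleles.values())))
```

In practice each candidate column would still need the visual trace inspection described above before being accepted as a SNP.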

Nucleotide diversity

Nucleotide diversity (θ) was estimated according to Halushka et al. (1999):
$$ \theta = \frac{K}{aL} $$
$$ a = \sum_{i = 2}^{n} \frac{1}{i - 1} $$
where K is the number of SNPs identified in an alignment of n genotypes, L is the total length of aligned sequences in bp, and \( a = \sum 1/(i - 1) \), with i ranging from 2 to n.
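The estimator above can be written as a short Python sketch. The function names are assumptions made for illustration; the input values (K = 677 SNPs, L = 131,120 bp, n = 6 genotypes) are the overall figures reported in this work.

```python
def harmonic_a(n):
    """a = sum of 1/(i - 1) for i = 2..n, where n is the number of genotypes."""
    return sum(1.0 / (i - 1) for i in range(2, n + 1))


def halushka_theta(k, length_bp, n):
    """theta = K / (a * L) for K SNPs in L aligned bp across n genotypes."""
    return k / (harmonic_a(n) * length_bp)


# Overall values reported in this work: 677 SNPs, 131,120 bp, six genotypes.
theta_all = halushka_theta(677, 131120, 6)
print(round(theta_all, 5))  # 0.00226
```

For six genotypes a = 1 + 1/2 + 1/3 + 1/4 + 1/5 ≈ 2.283, so θ is the per-base SNP frequency divided by about 2.28.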

Results

A total of 1,499 soybean STS primer pairs were tested (Table 1). Of these, 513 (34.22%) amplified a single PCR product in the common bean genome. A total of 128 (8.54%) of the soybean STS primer pairs produced high-quality sequence data. The total length of aligned sequences was 66,085 bp and the mean length of the amplicons was 516 bp. Two hundred and seventy-seven SNPs were identified in 81 (5.40%) of the DNA fragments. The frequency of SNPs was 4.19 SNPs/kb and the nucleotide diversity expressed as Halushka’s theta (θ) was 0.00184 (Table 1).
Table 1

Detection of SNPs in common bean DNA fragments generated by amplification using soybean-derived STS primers

| | Primers designed to shotgun sequences: No. | Primers designed to shotgun sequences: % of total | Primers designed to BAC end sequences: No. | Primers designed to BAC end sequences: % of total | Total: No. | Total: % of total |
|---|---|---|---|---|---|---|
| Primer pairs tested^a | 731 | | 768 | | 1,499 | |
| PCR and agarose gel analysis | | | | | | |
| Primer pairs producing no product | 49 | 6.70 | 246 | 32.03 | 295 | 19.68 |
| Primer pairs producing multiple bands | 463 | 63.34 | 228 | 29.69 | 691 | 46.10 |
| Primer pairs producing a single band^b | 219 | 29.96 | 294 | 38.28 | 513 | 34.22 |
| DNA sequence analysis | | | | | | |
| Multiple amplicons | 10 | 1.37 | 7 | 0.91 | 17 | 1.13 |
| Single amplicon (STS) | 73 | 9.97 | 55 | 7.16 | 128 | 8.54 |
| Poor or no sequence data | 136 | 18.62 | 232 | 30.21 | 368 | 24.55 |
| Fragments with at least one SNP | 46 | 6.29 | 35 | 4.56 | 81 | 5.40 |
| Length of aligned sequence | | | | | | |
| Total (bp) | 40,121 | | 25,964 | | 66,085 | |
| Mean STS length (bp) | 550 | | 472 | | 516 | |
| SNPs | | | | | | |
| Total | 119 | | 158 | | 277 | |
| Frequency (SNPs/kb) | 2.97 | | 6.08 | | 4.19 | |
| Nucleotide diversity (θ^c × 1,000) | 1.30 | | 2.67 | | 1.84 | |

Number of PCR primer pairs tested and results of PCR and sequence analysis in six genotypes of Phaseolus vulgaris

^a The primer pairs were initially used to amplify the DNA of the common bean cultivar Jalo EEP 558 followed by DNA sequence analysis of the resulting amplicon. When high-quality sequence data were obtained, the STS primers were used to amplify and sequence genomic DNA of the other five genotypes that are the parents of three mapping populations: AND 277, BAT 93, DOR 364, G 19833, and Rudá

^b Primer pairs that amplified a single band at 58°C or 48°C (annealing temperature)

^c \( \theta = K/(aL) \), where K is the number of SNPs identified in an alignment of n genotypes, L is the total length of aligned sequences in bp, and \( a = \sum 1/(i - 1) \), with i ranging from 2 to n

Of the 168 PCR primer pairs designed to common bean gene sequences, 109 (64.88%) amplified a single discrete amplicon and 66 (39.29%) yielded high-quality DNA sequences. The total length of aligned sequences was 38,167 bp and the mean length of STSs was 578 bp. A total of 237 SNPs were identified in 48 (28.57%) distinct DNA fragments. The frequency of SNPs was 6.21 SNPs/kb and the nucleotide diversity was θ = 0.00272 (Table 2).
Table 2

Detection of SNPs in common bean DNA fragments generated by amplification using primers designed to common bean gene sequences

| | Primers designed to selected candidate gene sequences: No. | Primers designed to selected candidate gene sequences: % of total | Primers designed to random gene sequences (GenBank): No. | Primers designed to random gene sequences (GenBank): % of total | Total: No. | Total: % of total |
|---|---|---|---|---|---|---|
| Primer pairs tested^a | 72 | | 96 | | 168 | |
| PCR and agarose gel analysis | | | | | | |
| Primer pairs producing no product | 14 | 19.45 | 36 | 37.50 | 50 | 29.76 |
| Primer pairs producing multiple bands | 6 | 8.33 | 3 | 3.12 | 9 | 5.36 |
| Primer pairs producing a single band^b | 52 | 72.22 | 57 | 59.38 | 109 | 64.88 |
| DNA sequence analysis | | | | | | |
| Multiple amplicons | 3 | 4.17 | 2 | 2.08 | 5 | 2.97 |
| Single amplicon (STS) | 25 | 34.72 | 41 | 42.71 | 66 | 39.29 |
| Poor or no sequence data | 24 | 33.33 | 14 | 14.59 | 38 | 22.62 |
| Fragments with at least one SNP | 16 | 22.22 | 32 | 33.33 | 48 | 28.57 |
| Length of aligned sequence | | | | | | |
| Total (bp) | 13,549 | | 24,618 | | 38,167 | |
| Mean STS length (bp) | 542 | | 600 | | 578 | |
| SNPs | | | | | | |
| Total | 98 | | 139 | | 237 | |
| Frequency (SNPs/kb) | 7.23 | | 5.65 | | 6.21 | |
| Nucleotide diversity (θ^c × 1,000) | 3.17 | | 2.47 | | 2.72 | |

Number of PCR primer pairs tested and results of PCR and sequence analysis in six genotypes of Phaseolus vulgaris

^a The primer pairs were initially used to amplify the DNA of the common bean cultivar Jalo EEP 558 followed by DNA sequence analysis of the resulting amplicon. When high-quality sequence data were obtained, the STS primers were used to amplify and sequence genomic DNA of the other five genotypes that are parents of three mapping populations: AND 277, BAT 93, DOR 364, G 19833, and Rudá

^b Primer pairs that amplified a single band at 54°C (annealing temperature)

^c \( \theta = K/(aL) \), where K is the number of SNPs identified in an alignment of n genotypes, L is the total length of aligned sequences in bp, and \( a = \sum 1/(i - 1) \), with i ranging from 2 to n

Seven hundred and fifty-eight P. vulgaris SSR primer pairs were tested. Of these, 477 (62.93%) amplified a single DNA band. Because the DNA sequencing platform used (ABI 3730) did not efficiently sequence amplicons smaller than 200 bp, only the 213 primer pairs (28.10%) that produced bands > 200 bp were selected for DNA sequence analysis. Of these 213 bands, 71 (33.33%) produced high-quality sequencing data. The total length of aligned sequences was 26,868 bp and the mean length of single amplicons was 378 bp. One hundred and sixty-three SNPs were identified in 44 (20.66%) distinct fragments. The frequency of SNPs was 6.07 SNPs/kb and the nucleotide diversity was θ = 0.00266 (Table 3).
Table 3

Detection of SNPs in common bean DNA fragments generated by PCR primers designed to common bean microsatellite flanking regions

| Primers designed to microsatellite flanking regions | No. | % of total |
|---|---|---|
| Primer pairs tested^a | 758 | |
| PCR and agarose gel analysis | | |
| Primer pairs producing no product | 110 | 14.51 |
| Primer pairs producing multiple bands | 171 | 22.56 |
| Primer pairs producing a single band^b | 477 | 62.93 |
| Primer pairs producing a band > 200 bp | 213 | 28.10 |
| Primer pairs tested (band > 200 bp) | 213 | |
| DNA sequence analysis | | |
| Multiple amplicons | 8 | 3.76 |
| Poor or no sequence data | 134 | 62.91 |
| Single amplicon (STS) | 71 | 33.33 |
| Fragments with at least one SNP | 44 | 20.66 |
| Length of aligned sequence | | |
| Total (bp) | 26,868 | |
| Mean STS length (bp) | 378 | |
| SNPs | | |
| Total | 163 | |
| Frequency (SNPs/kb) | 6.07 | |
| Nucleotide diversity (θ^c × 1,000) | 2.66 | |

Number of PCR primer pairs tested and results of PCR and sequence analysis in six genotypes of Phaseolus vulgaris

^a The primer pairs were initially used to amplify the DNA of the common bean cultivar Jalo EEP 558 followed by DNA sequence analysis of the resulting amplicon. When high-quality sequence data were obtained, the STS primers were used to amplify and sequence genomic DNA of the other five genotypes that are parents of three mapping populations: AND 277, BAT 93, DOR 364, G 19833, and Rudá

^b Primer pairs that amplified a single band at 54°C (annealing temperature)

^c \( \theta = K/(aL) \), where K is the number of SNPs identified in an alignment of n genotypes, L is the total length of aligned sequences in bp, and \( a = \sum 1/(i - 1) \), with i ranging from 2 to n
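The percentages reported for the SSR primers use two different denominators: the PCR-stage values are relative to all 758 primer pairs tested, whereas the sequencing-stage values are relative to the 213 pairs that gave a band > 200 bp. The Python sketch below recomputes them from the counts in Table 3; the variable names and the `pct` helper are assumptions introduced for illustration.

```python
# Counts taken from Table 3.
tested = 758        # SSR primer pairs tested by PCR
single_band = 477   # pairs producing a single band
over_200bp = 213    # pairs producing a single band > 200 bp (sequenced)
sts = 71            # pairs yielding a single amplicon with good sequence
with_snp = 44       # fragments with at least one SNP


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# PCR stage: denominator is all primer pairs tested.
print(pct(single_band, tested), pct(over_200bp, tested))

# Sequencing stage: denominator is the 213 pairs that were sequenced.
print(pct(sts, over_200bp), pct(with_snp, over_200bp))
```

Keeping the two denominators separate matters when the SSR results are compared with Tables 1 and 2, where all percentages are relative to the number of primer pairs tested.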

The results obtained using all 1,880 distinct PCR primer pairs are summarized in Table 4. The aligned sequence of the 265 (14.10%) STSs resulted in the discovery of 677 SNPs in 173 (9.20%) distinct DNA fragments. The final length of aligned sequences was 131,120 bp and the mean length of the amplicons was 495 bp. The overall frequency of SNPs was 5.16 SNPs/kb and the mean nucleotide diversity was θ = 0.00226.
Table 4

Summary of SNP discovery in Phaseolus vulgaris DNA fragments generated by common bean-derived and soybean-derived PCR primers

| Source of primers^a | Soybean STSs | Common bean gene sequences | Common bean SSRs | Total |
|---|---|---|---|---|
| No. of tested primers | 1,499 | 168 | 213^b | 1,880 |
| No. of single amplicons, STS (%) | 128 (8.54) | 66 (39.29) | 71 (33.33) | 265 (14.10) |
| Fragments with at least one SNP (%) | 81 (5.40) | 48 (28.57) | 44 (20.66) | 173 (9.20) |
| Sequence length (bp) | 66,085 | 38,167 | 26,868 | 131,120 |
| Mean STS length (bp) | 516 | 578 | 378 | 495 |
| No. of SNPs | 277 | 237 | 163 | 677 |
| SNP frequency (SNPs/kb) | 4.19 | 6.21 | 6.07 | 5.16 |
| Nucleotide diversity (θ × 1,000) | 1.84 | 2.72 | 2.66 | 2.26 |

^a See Tables 1, 2, and 3 for more details

^b Primer pairs producing a single band > 200 bp selected from a total of 758 tested common bean SSR primers to fit the requirement of the DNA sequencing platform utilized (ABI 3730)
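The totals in Table 4 follow from the three per-source results, and the overall SNP frequency and mean STS length are derived from those totals. The Python sketch below recomputes them from the values in Table 4; the dictionary layout and key names are assumptions chosen for illustration.

```python
# Per-source values from Table 4.
sources = {
    "soybean STSs":      {"sts": 128, "with_snp": 81, "bp": 66085, "snps": 277},
    "common bean genes": {"sts": 66,  "with_snp": 48, "bp": 38167, "snps": 237},
    "common bean SSRs":  {"sts": 71,  "with_snp": 44, "bp": 26868, "snps": 163},
}

# Column totals across the three primer sources.
totals = {
    key: sum(row[key] for row in sources.values())
    for key in ("sts", "with_snp", "bp", "snps")
}

snps_per_kb = totals["snps"] / (totals["bp"] / 1000)
mean_sts_length = totals["bp"] / totals["sts"]

print(totals)
print(round(snps_per_kb, 2), round(mean_sts_length))
```

The same per-kb calculation applied to a single source, for example 277 SNPs in 66,085 bp for the soybean-derived STSs, gives the corresponding value in the source column.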

The SNP classes are shown in Table 5. Of the 277 SNPs identified in DNA fragments generated with soybean STS primers, 123 (44.40%) are transitions (A↔G and C↔T), 118 (42.60%) are transversions (A↔C, A↔T, C↔G, and G↔T), and 36 (13.00%) are indels. Among the 237 SNPs found in amplicons produced by primers designed to common bean genes, 109 (46.00%) are transitions, 88 (37.13%) are transversions, and 40 (16.87%) are indels. Of the 163 SNPs identified in PCR products generated by common bean SSR primers, 63 (38.65%) are transitions, 54 (33.13%) are transversions, and 46 (28.22%) are indels. Of the total of 677 SNPs identified in this work, 295 (43.57%) are transitions, 260 (38.41%) are transversions, and 122 (18.02%) are indels.
Table 5

Characteristics of SNPs identified in Phaseolus vulgaris DNA fragments generated by common bean-derived and soybean-derived PCR primers

| Source of primers^a | SNPs | Transitions^b: No. | Transitions^b: % of total | Transversions^c: No. | Transversions^c: % of total | Indels^d: No. | Indels^d: % of total |
|---|---|---|---|---|---|---|---|
| Soybean STSs | 277 | 123 | 44.40 | 118 | 42.60 | 36 | 13.00 |
| Common bean genes | 237 | 109 | 46.00 | 88 | 37.13 | 40 | 16.87 |
| Common bean SSRs | 163 | 63 | 38.65 | 54 | 33.13 | 46 | 28.22 |
| Total | 677 | 295 | 43.57 | 260 | 38.41 | 122 | 18.02 |

Transitions and transversions are single-base changes

^a See Tables 1, 2, and 3 for more details

^b A↔G and C↔T

^c A↔C, A↔T, C↔G, and G↔T

^d Small nucleotide insertions and deletions
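The three SNP classes used in Table 5 can be expressed as a simple rule: a change within the purines (A, G) or within the pyrimidines (C, T) is a transition, a change between the two groups is a transversion, and a gap against a base is an indel. The Python sketch below applies that rule to a pair of alleles; the function name and the use of "-" for a gap are assumptions, and the transition:transversion ratio at the end is derived here from the Table 5 totals rather than reported in the text.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def classify_snp(allele_1, allele_2):
    """Classify a biallelic site as 'transition', 'transversion', or 'indel'.

    Alleles are single bases (A, C, G, T) or '-' for a gap.
    """
    a, b = allele_1.upper(), allele_2.upper()
    if a == b:
        raise ValueError("alleles are identical; not a polymorphism")
    if "-" in (a, b):
        return "indel"
    if not ({a, b} <= PURINES | PYRIMIDINES):
        raise ValueError("expected A, C, G, T or '-'")
    same_class = {a, b} <= PURINES or {a, b} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


for pair in (("A", "G"), ("C", "T"), ("A", "C"), ("G", "T"), ("-", "A")):
    print(pair, classify_snp(*pair))

# Derived from Table 5 totals (295 transitions, 260 transversions).
ts_tv_ratio = 295 / 260
print(round(ts_tv_ratio, 2))
```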

Discussion

As part of an international effort led by the Soybean Genomics and Improvement Laboratory (SGIL), USDA/ARS, aiming to develop SNP markers for P. vulgaris, primers designed to soybean genes and BAC end sequences and to common bean genes and SSR flanking regions were examined for locus-specific PCR amplification followed by resequencing of the resulting amplicons in a set of six diverse common bean genotypes. The use of available PCR primers (soybean STSs and common bean SSR primers) aimed to save time and money in the initial steps of primer development. However, in the case of the soybean-derived primers, only 5.40% produced DNA fragments with at least one SNP (Table 1). Despite the advantages mentioned above and the relatively close relationship of the soybean and common bean genomes (Zhu et al. 2005), the use of soybean-derived primers was not as productive as originally anticipated. The low rate of SNP discovery reported here was, largely, the result of the inability to develop robust STSs using these primers (Table 1).

The recent development of a large number of common bean SSR markers provided a valuable resource of STSs for SNP discovery. Of the SSR primer sets available as of August 2008 and with sequences deposited on the Bean Improvement Cooperative website (http://www.css.msu.edu/bic/Genetics.cfm), we selected those that had not been mapped in the bean core mapping population because they were not polymorphic. Our goal with this selection was to generate information useful for increasing the saturation of the P. vulgaris core genetic map. Of a total of 758 SSR primer pairs tested in this work, 62.93% were found to produce a single discrete PCR product. This high percentage was expected as the primers had been previously identified and/or validated in common bean. However, because the DNA sequencing platform (ABI 3730) used was not efficient in the sequence analysis of small amplicons, we were able to use only 28.10% (213) of the SSR primer sets (PCR products > 200 bp). These 213 SSR primer pairs were used for sequence analysis of the six contrasting P. vulgaris genotypes, and 20.66% produced single amplicons with good-quality DNA sequence data and containing at least one SNP (Table 3).

Of a total of 168 primer sets designed to common bean gene sequences and examined for locus-specific PCR amplification followed by resequencing for SNP discovery, 28.57% were determined to contain a sequence variant (Table 2). The efficiency of the common-bean-derived PCR primers in producing SNP-containing amplicons was higher than that observed in soybean (21.50%) using the same strategy (Choi et al. 2007). The high level of duplication in the soybean genome complicates the development of robust STSs and may be the main explanation for this discrepancy.

The aligned sequence from all 265 STSs (131,120 bp) obtained from the six common bean genotypes resulted in the discovery of 677 SNPs in 173 DNA fragments (Table 4). Approximately 81.98% of the SNPs were single-base changes: 43.57% of all SNPs were transitions and 38.41% were transversions. The remaining 18.02% of the SNPs were indels (Table 5). The proportion of SNP classes detected here differs significantly from that previously reported in common bean (31.45% transitions, 35.75% transversions, and 32.80% indels) by Gaitán-Solís et al. (2008) and in soybean (55.70% transitions, 44.30% transversions, and 15.00% indels) by Choi et al. (2007).

The nucleotide diversity verified in this work (θ = 0.00226) in the 131,120 bp of common bean DNA sequence analyzed is approximately twice that detected in soybean by Zhu et al. (2003) (θ = 0.00097), Hyten et al. (2006) (θ = 0.00115), and Choi et al. (2007) (θ = 0.000997). However, a similar level of nucleotide diversity was verified in rice (θ = 0.00181) (Feltus et al. 2004), barley (θ = 0.0025) (Kanazin et al. 2002), and sorghum (θ = 0.0023) (Hamblin et al. 2004). Higher nucleotide diversity levels have also been reported in maize (θ = 0.00627) (Wright et al. 2005) and sugarbeet (θ = 0.0077) (Schneider et al. 2001). A high nucleotide diversity value (θ = 0.00627) was reported for common bean (Gaitán-Solís et al. 2008), similar to that detected in maize. The higher diversity level detected by these authors can be explained by the use of 10 highly diverse cultivated and wild bean genotypes from Mesoamerican and Andean origins.
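The statement that common bean diversity is approximately twice that of soybean can be checked directly from the θ values quoted above. The Python sketch below computes the ratio against each soybean estimate; the dictionary and variable names are assumptions for illustration, and the ratios are derived here rather than reported in the text.

```python
theta_common_bean = 0.00226  # this work

# Soybean estimates quoted in the text.
theta_soybean = {
    "Zhu et al. (2003)": 0.00097,
    "Hyten et al. (2006)": 0.00115,
    "Choi et al. (2007)": 0.000997,
}

# Fold difference of the common bean estimate over each soybean estimate.
ratios = {
    study: round(theta_common_bean / value, 2)
    for study, value in theta_soybean.items()
}

for study, ratio in ratios.items():
    print(f"{study}: {ratio}-fold")
```

The three ratios fall between roughly 2.0 and 2.3, which is consistent with the wording "approximately twice".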

It is important to highlight that the SNPs identified in the present work are not putative genome variations. They are real DNA polymorphisms, each confirmed by visual inspection of the sequence traces, that will be useful for genetic mapping, diversity analysis, and microsynteny studies among legume species. Additional information about these SNPs, i.e., their forward and reverse primers and respective SNP-containing amplicon sequences, is available in Online Resource 1. Currently, SNP discovery approaches using reduced representation libraries and high-throughput sequence analysis on the Illumina Genome Analyzer are being undertaken at the USDA, ARS, Beltsville, MD, USA. In the near future, the common bean SNPs developed in this work and others will be tested in the common bean germplasm using the GoldenGate assay on the Illumina BeadStation. In this assay it is possible to multiplex from 96 to 1,536 SNPs in a single reaction over a 3-day period (Fan et al. 2003). For this reason, it can be used for high-throughput SNP genotyping. The Illumina GoldenGate assay has been demonstrated to function well with the complex paleopolyploid genome of soybean (Hyten et al. 2008); it should therefore also work for the less complex genome of the common bean. These data should boost the common bean genetic mapping efforts aimed at QTL discovery and map-based cloning, as well as the use of marker-assisted selection for cultivar development.

Notes

Acknowledgments

This work was supported by grants from the United States Department of Agriculture, Agricultural Research Service (USDA/ARS; US Government). The first author was supported by a PhD fellowship from Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq; Brazilian Government). The support from CNPq was greatly appreciated. We also thank Alicia Bertles from the Bovine Functional Genomics Laboratory, BARC-East, USDA/ARS (Beltsville, MD, USA) for assistance with the DNA sequencing.

Supplementary material

Supplementary material 1 (11032_2011_9632_MOESM1_ESM.xls, 653 kb)

References

  1. Broughton WJ, Hernandez G, Blair M, Beebe S, Gepts P, Vanderleyden J (2003) Beans (Phaseolus spp.)–model food legumes. Plant Soil 252:55–128CrossRefGoogle Scholar
  2. Chagné D, Gasic K, Crowhurst RN, Han Y, Bassett HC, Bowatte DR, Lawrence TJ, Rikkerink EHA, Gardiner SE, Korban SS (2008) Development of a set of SNP markers present in expressed genes of the apple. Genomics 92:353–358PubMedCrossRefGoogle Scholar
  3. Ching A, Caldwell KS, Jung M, Dolan M, Smith OS, Tingey S, Morgante M, Rafalski AJ (2002) SNP frequency, haplotype structure and linkage disequilibrium in elite maize inbred lines. BMC Genet 3:19PubMedCrossRefGoogle Scholar
  4. Choi IY, Hyten DL, Matukumalli LK, Song Q, Chaky JM, Quigley CV, Chase K, Lark KG, Reiter RS, Yoon MS, Hwang EY, Yi SI, Young ND, Shoemaker RC, VanTassell CP, Specht JE, Cregan PB (2007) A soybean transcript map: gene distribution, haplotype and SNP analysis. Genetics 176:685–696PubMedCrossRefGoogle Scholar
  5. Ewing B, Green P (1998) Base-calling of automated sequencer traces using Phred. II. Error probabilities. Genome Res 8:186–194PubMedGoogle Scholar
  6. Fan JB, Oliphant A, Shen R, Kermani BG, Garcia F, Gunderson KL, Hansen M, Steemers F, Butler SL, Deloukas P, Galver L, Hunt S, McBride C, Bibikova M, Rubano T, Chen J, Wickham E, Doucet D, Chang W, Campbell D, Zhang B, Kruglyak S, Bentley D, Haas J, Rigault P, Zhou L, Stuelpnagel J, Chee MS (2003) Highly parallel SNP genotyping. Cold Spring Harbor Symp Quant Biol 68:69–78PubMedCrossRefGoogle Scholar
  7. Fan JB, Chee MS, Gunderson KL (2006) Highly parallel genomic assays. Nat Rev Genet 7:632–644PubMedCrossRefGoogle Scholar
  8. Feltus FA, Wan J, Schulze SR, Estill JC, Jiang N, Paterson AH (2004) An SNP resource for rice genetics and breeding based on subspecies indica and japonica genome alignments. Genome Res 14:1812–1819PubMedCrossRefGoogle Scholar
  9. Fernández MP, Núñez Y, Ponz F, Hernáiz S, Gallego FJ, Ibáñez J (2008) Characterization of sequence polymorphisms from microsatellite flanking regions in Vitis spp. Mol Breed 22:455–465CrossRefGoogle Scholar
  10. Freyre R, Skroch PW, Geffroy V, Blondon AF, Shirmohamadali A, Johnson WC, Llaca V, Nodari RO, Pereira PA, Tsai SM, Tohme J, Dron M, Nienhuis J, Vallejos CE, Gepts P (1998) Towards an integrated linkage map of common bean. 4. Development of a core linkage map and alignment of RFLP maps. Theor Appl Genet 97:847–856CrossRefGoogle Scholar
  11. Gaitán-Solís E, Choi IY, Quigley C, Cregan P, Tohme J (2008) Single nucleotide polymorphisms in common bean: their discovery and genotyping using a multiplex detection system. Plant Genome 1:125–134CrossRefGoogle Scholar
  12. Geffroy V, Creusot F, Falquet J, Sévignac J, Adam-Blondon AF, Bannerot H, Gepts P, Dron M (1998) A family of LRR sequences in the vicinity of the Co-2 locus for anthracnose resistance in Phaseolus vulgaris and its potential use in marker-assisted selection. Theor Appl Genet 96:494–502
  13. Gordon D, Abajian C, Green P (1998) Consed: a graphical tool for sequence finishing. Genome Res 8:195–202
  14. Halushka MK, Fan JB, Bentley K, Hsie L, Shen N, Weder A, Cooper R, Lipshutz R, Chakravarti A (1999) Patterns of single-nucleotide polymorphisms in candidate genes for blood-pressure homeostasis. Nat Genet 22:239–247
  15. Hamblin MT, Mitchell SE, White GM, Gallego J, Kukatla R, Wing RA, Paterson AH, Kresovich S (2004) Comparative population genetics of the panicoid grasses: sequence polymorphism, linkage disequilibrium and selection in a diverse sample of Sorghum bicolor. Genetics 167:471–483
  16. Hyten DL, Song Q, Zhu Y, Choi IY, Nelson RL, Costa JM, Specht JE, Shoemaker RC, Cregan PB (2006) Impacts of genetic bottlenecks on soybean genome diversity. Proc Natl Acad Sci USA 103:16666–16671
  17. Hyten DL, Song Q, Choi IY, Yoon M, Specht JE, Matukumalli LK, Shoemaker RC, Young ND, Cregan PB (2008) High-throughput genotyping with the GoldenGate assay in the highly complex genome of soybean. Theor Appl Genet 116:945–952
  18. Hyten DL, Song Q, Choi IY, Nelson RL, Carter TE, Specht JE, Cannon SB, Shoemaker RC, Pantalone VR, Cregan PB (2009) The accelerating pace of soybean genomics for marker development, quantitative trait loci discovery, and germplasm characterization. XVII Plant Animal Genome Conf http://www.intl-pag.org/17/abstracts/W70_PAGXVII_472.html. Accessed 16 Nov 2010
  19. Jander G, Norris SR, Rounsley SD, Bush DF, Levin IM, Last RL (2002) Arabidopsis map-based cloning in the post-genome era. Plant Physiol 129:440–450
  20. Jin Q, Waters D, Cordeiro GM, Henry RJ, Reinke RF (2003) A single nucleotide polymorphism (SNP) marker linked to the fragrance gene in rice (Oryza sativa L.). Plant Sci 165:359–364
  21. Kanazin V, Talbert H, See D, DeCamp P, Nevo E, Blake T (2002) Discovery and assay of single-nucleotide polymorphisms in barley (Hordeum vulgare). Plant Mol Biol 48:529–537
  22. Kelly JD, Gepts P, Miklas PN, Coyne DP (2003) Tagging and mapping of genes and QTL and molecular marker-assisted selection for traits of economic importance in bean and cowpea. Field Crops Res 82:135–154
  23. Matukumalli LK, Grefenstette JJ, Hyten DL, Choi IY, Cregan PB, VanTassell CP (2006) SNP-PHAGE: high throughput SNP discovery pipeline. BMC Bioinform 7:468–474
  24. Miklas PN, Kelly JD, Beebe SE, Blair MW (2006) Common bean breeding for resistance against biotic and abiotic stresses: from classical to MAS breeding. Euphytica 147:105–131
  25. Nordborg M, Hu TT, Ishino Y, Jhaveri J, Toomajian C, Zheng H, Bakker E, Calabrese P, Gladstone J, Goyal R, Jakobsson M, Kim S, Morozov Y, Padhukasahasram B, Plagnol V, Rosenberg NA, Shah C, Wall JD, Wang J, Zhao K, Kalbfleisch T, Schulz V, Kreitman M, Bergelson J (2005) The pattern of polymorphism in Arabidopsis thaliana. PLoS Biol 3:e196
  26. Rostoks N, Mudie S, Cardle L, Russell J, Ramsay L, Booth A, Svensson JT, Wanamaker SI, Walia H, Rodriguez EM, Hedley PE, Liu H, Morris J, Close TJ, Marshall DF, Waugh R (2005) Genome-wide SNP discovery and linkage analysis in barley based on genes responsive to abiotic stress. Mol Genet Genomics 274:515–527
  27. Schmid KJ, Sorensen TR, Stracke R, Torjek O, Altmann T, Mitchell-Olds T, Weisshaar B (2003) Large-scale identification and analysis of genomewide single-nucleotide polymorphisms for mapping in Arabidopsis thaliana. Genome Res 13:1250–1257
  28. Schmid KJ, Ramos-Onsins S, Ringys-Beckstein H, Weisshaar B, Mitchell-Olds T (2005) A multilocus sequence survey in Arabidopsis thaliana reveals a genomewide departure from a neutral model of DNA sequence polymorphism. Genetics 169:1601–1615
  29. Schneider K, Weisshaar B, Borchardt DC, Salamini F (2001) SNP frequency and allelic haplotype structure of Beta vulgaris expressed genes. Mol Breed 8:63–74
  30. Tenaillon MI, Sawkins MC, Long AD, Gaut RL, Doebley JF (2001) Patterns of DNA sequence polymorphism along chromosome 1 of maize (Zea mays ssp. mays L.). Proc Natl Acad Sci USA 98:9161–9166
  31. Tuskan GA, DiFazio S, Jansson S, Bohlmann J, Grigoriev I, Hellsten U, Putnam N, Ralph S, Rombauts S, Salamov A, Schein J, Sterck L, Aerts A, Bhalerao RR, Bhalerao RP, Blaudez D, Boerjan W, Brun A, Brunner A, Busov V, Campbell M, Carlson J, Chalot M, Chapman J, Chen GL, Cooper D, Coutinho PM, Couturier J, Covert S, Cronk Q, Cunningham R, Davis J, Degroeve S, Déjardin A, Pamphilis C, Detter J, Dirks B, Dubchak I, Duplessis S, Ehlting J, Ellis B, Gendler K, Goodstein D, Gribskov M, Grimwood J, Groover A, Gunter L, Hamberger B, Heinze B, Helariutta Y, Henrissat B, Holligan D, Holt R, Huang W, Islam-Faridi N, Jones S, Jones-Rhoades M, Jorgensen R, Joshi C, Kangasjarvi J, Karlsson J, Kelleher C, Kirkpatrick R, Kirst M, Kohler A, Kalluri U, Larimer F, Leebens-Mack J, Leplé JC, Locascio P, Lou Y, Lucas S, Martin F, Montanini B, Napoli C, Nelson DR, Nelson C, Nieminen K, Nilsson O, Pereda V, Peter G, Philippe R, Pilate G, Poliakov A, Razumovskaya J, Richardson P, Rinaldi C, Ritland K, Rouzé P, Ryaboy D, Schmutz J, Schrader J, Segerman B, Shin H, Siddiqui A, Sterky F, Terry A, Tsai CJ, Uberbacher E, Unneberg P, Vahala J, Wall K, Wessler S, Yang G, Yin T, Douglas C, Marra M, Sandberg G, Van de Peer Y, Rokhsar D (2006) The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science 313:1596–1604
  32. Wortmann CS, Kirkby RA, Eledu CKA, Allen DJ (1998) Atlas of common bean (Phaseolus vulgaris L.) production in Africa. Cali, CIAT, p 17
  33. Wright SI, Bi IV, Schroeder SG, Yamasaki M, Doebley JF, McMullen MD, Gaut BS (2005) The effects of artificial selection on the maize genome. Science 308:1310–1314
  34. Zhu YL, Song QJ, Hyten DL, VanTassell CP, Matukumalli LK, Grimm DR, Hyatt SM, Fickus EW, Young ND, Cregan PB (2003) Single-nucleotide polymorphisms in soybean. Genetics 163:1123–1134
  35. Zhu H, Choi HK, Cook DR, Shoemaker RC (2005) Bridging model and crop legumes through comparative genomics. Plant Physiol 137:1189–1196

Copyright information

© Springer Science+Business Media B.V. 2011

Authors and Affiliations

  • Thiago Lívio P. O. Souza (1, 3)
  • Everaldo G. de Barros (1)
  • Claudia M. Bellato (2)
  • Eun-Young Hwang (2)
  • Perry B. Cregan (2)
  • Marcial A. Pastor-Corrales (2)
  1. Instituto de Biotecnologia Aplicada à Agropecuária (BIOAGRO), Universidade Federal de Viçosa (UFV), Viçosa, Brazil
  2. Soybean Genomics and Improvement Laboratory, United States Department of Agriculture, Agriculture Research Service (USDA/ARS), BARC-West, Beltsville, USA
  3. Embrapa Arroz e Feijão, Santo Antônio de Goiás, Brazil
