Introduction

Soybean (Glycine max L.) is an economically important leguminous plant. It is considered a source of seed proteins and oil (Hrustić et al., 2006). Adequate levels of insect resistance in soybean can increase profits by reducing insecticide use and the risk of insecticide residues (Rowan et al., 1991). Soybean contains one or more chemicals that are considered genetically controlled resistance factors. The molecular and biochemical characters of insect-resistant and susceptible soybean cultivars, as well as the amount of genetic diversity among them, are taken into account for many purposes in soybean breeding (Fahmy and Salama, 2002).

Soybean contains high levels of protein and eight essential amino acids such as lysine, arginine, and leucine (Messina, 1995). It also has high levels of fatty acids and appreciable levels of vitamins and minerals. Soybeans contain large amounts of omega-6 fatty acid, linolenic acid, and isoflavones (daidzein and genistein) (Chauhan et al., 2008). Soybean contains about 30% soluble and insoluble carbohydrates (Rotundo and Westgate, 2009). Soybean proteins have the highest nutritional value for human food, being particularly high in lysine. Soybean flour is used for baking bread for diabetics. Soybean meal, which is obtained after oil extraction and defatting of the flour, is an indispensable source of protein in fish, poultry, and livestock nutrition (Popović et al., 2009). Low amounts of soybean isoflavones can reduce the risk of developing breast, ovarian, cervical, colon, and lung cancer. Soybean also contains asparaginase, a therapeutically important protein used with other medicines in the treatment of acute lymphatic leukemia and melanosarcoma.

For many years, morphological traits and molecular and biochemical analyses have been used to detect and assess the phylogenetic relationships among soybean cultivars (Badr and Halawa, 2012; Barakat et al., 2013). Morphological characters may not be sufficiently distinctive and commonly require plants to be grown to maturity before identification. Morphological traits are also considered unstable because of environmental effects (Ghalmi et al., 2010).

Sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) is extensively used to detect the phylogenetic relationships between plants. Many researchers recommended using SDS-PAGE as a rapid method to identify and characterize soybean (Maged and Shawkat, 2012).

Molecular markers play an important role in assessing the phylogenetic relationships between and within different species and cultivars. Deoxyribonucleic acid (DNA)-based markers are commonly used in studies of genetic diversity, comparative biology, morphological characters, environmental conditions, conservation, and phylogenetic phenomena between plant species and cultivars (Haq et al., 2014). Many molecular markers have been used for cultivar identification and phylogenetic analysis (Bornet and Branchard, 2004; Semagn et al., 2006).

After the advent of PCR, different molecular markers were developed, such as amplified fragment length polymorphism (AFLP), single-nucleotide polymorphism (SNP), inter simple sequence repeats (ISSRs), and simple sequence repeats (SSR), along with several new concepts. Many researchers have used different markers to determine the relationships between plant species, for example in tomato cultivars (Nawaz et al., 2015), Poaceae plants (Haq et al., 2016), and Citrullus colocynthis (Verma et al., 2017).

Start codon targeted (SCoT) polymorphism is a novel molecular marker introduced by Collard and Mackill (2009). It targets the conserved start codon (ATG) in plant genes, or the short region flanking this translation initiation site (Collard and Mackill, 2009). It is characterized by higher polymorphism and better marker resolvability than other DNA marker techniques such as random amplified polymorphic DNA (RAPD) and ISSR, which accounts for its popularity (Gorji et al., 2011). Compared with other molecular markers, SCoT markers are easier to use, cheaper, faster, and do not require radioactive materials (Mulpuri et al., 2013). SCoTs are also more directly applicable to marker-assisted breeding programs than RAPDs, ISSRs, and SSRs (Mulpuri et al., 2013).

SCoT markers have been efficiently used for DNA fingerprinting of Tritordeum bergrothii L. Poaceae (Cabo et al., 2014). Several studies have shown that SCoT markers have a greater ability than random primers to identify polymorphism and determine genetic variation among species (Zeng et al., 2014; Tiwari et al., 2016). SCoT markers have been used in many crop plant species such as rice (Collard and Mackill, 2009), cowpea (Igwe et al., 2017), and Plantago (Rahimi et al., 2018).

The goal of this research is to evaluate the effectiveness of SCoT and protein banding pattern (SDS-PAGE) markers in determining the phylogenetic relationships among six Egyptian soybean (Glycine max L.) cultivars.

Materials and methods

Plant material and growth conditions

Seeds of six Egyptian soybean (Glycine max L.) cultivars (Giza111, Giza21, Giza82, Giza35, Giza22, and Giza83) were kindly supplied by the Agricultural Research Center (ARC), Giza, Egypt.

DNA extraction

Genomic DNA was extracted and purified from young leaves of the cultivars using the GeneJET Plant Genomic DNA Purification Mini Kit (Thermo Scientific, K0791, made in Germany). The final concentration of DNA was adjusted to 50 ng/μl. All DNA samples were stored at −20 °C.

Primer design

Primers were designed from consensus sequences derived from the studies by Joshi et al. (1997) and Sawant et al. (1999). All primers were 18-mers with a GC content between 50 and 72% (Table 1). They were dissolved in sterilized water to a final concentration of 100 pM and kept at −20 °C.

Table 1 Sequence of SCoT primers used in this study

SCoT PCR reaction and amplification conditions

Fifteen primers were initially used for the amplification of the genomic DNA to assess the phylogenetic relationships. The primers that gave prominent and reproducible bands were selected for the final amplification and data analysis, and the primers that gave fewer loci were not included in the final data analysis. Eleven SCoT primers were finally used to differentiate all samples (Table 1).

PCR was optimized for the SCoT method, and the final optimized protocol is reported here. All PCR reactions were performed in a total volume of 50 μl in 96-well plates using a PTC-100 thermocycler (MJ Research, model PTC-100). PCR reaction mixtures contained PCR buffer (Promega; 20 mM Tris-HCl (pH 8.4), 50 mM KCl), 1.5 mM MgCl2, 0.24 mM of each deoxyribonucleotide triphosphate, 5 U of Taq polymerase (Promega), and 0.8 μM of primer. Each reaction contained 25 ng of template DNA. A standard PCR protocol was used: an initial denaturation step at 94 °C for 3 min, followed by 35 cycles of 94 °C for 1 min, 50 °C for 1 min, and 72 °C for 2 min, and a final extension at 72 °C for 5 min. All PCR amplification products were separated on 1.2% agarose gels in Tris-acetate-EDTA (TAE) buffer, stained with ethidium bromide, and visualized under UV light.

The gels were photographed with a gel documentation system (Bio-Rad) and analyzed with the Total Lab program to determine the molecular weight of each band and to compare the presence and absence of bands among cultivars. For phylogram reconstruction, a prominent band was scored as 1 when present and as 0 when absent; faint bands were excluded from the final data analysis. Polymorphic information content (PIC) values were calculated for each SCoT primer according to the formula PIC = 1 − p² − q² (Ghislain et al., 1999), where p is the frequency of the present band and q is the frequency of the absent band. The data were imported into the Multi-Variate Statistical Package (MVSP) to compute the similarity matrix and dendrogram (UPGMA, using Jaccard's coefficient), which reflect the relationships among the studied cultivars.
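The band scoring and the PIC formula described above can be illustrated with a minimal Python sketch. The band scores below are hypothetical, and averaging the per-band values to obtain a per-primer value is an assumption, since the text does not state how band-level values were aggregated.

```python
from statistics import mean


def band_pic(scores):
    """PIC = 1 - p^2 - q^2 for one band scored 1 (present) or 0 (absent).

    With q = 1 - p, the value for a single band lies between 0 and 0.5.
    """
    p = sum(scores) / len(scores)  # frequency of band presence
    q = 1.0 - p                    # frequency of band absence
    return 1.0 - p * p - q * q


def primer_pic(band_matrix):
    """Mean per-band PIC for one primer (the averaging step is an assumption)."""
    return mean(band_pic(row) for row in band_matrix)


# Hypothetical scores: rows are bands, columns are six cultivars.
example_primer = [
    [1, 1, 1, 1, 1, 1],  # monomorphic band, PIC = 0
    [1, 0, 1, 0, 1, 0],  # present in half of the cultivars, PIC = 0.5
    [1, 1, 0, 1, 1, 1],  # absent in a single cultivar
]

print(round(primer_pic(example_primer), 3))
```

A monomorphic band contributes nothing to the PIC, so primers that amplify mostly shared bands score low under this formula.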

SDS-PAGE

Sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) was performed according to the method of Laemmli (1970) as modified by Studier (1973). Seed storage proteins (water soluble proteins) were extracted from seeds of soybean cultivars. Protein fractionations were performed exclusively on vertical slab gel (19.8 cm × 26.8 cm × 0.2 cm) using the electrophoresis apparatus manufactured by Cleaver, UK.

The images were captured with a digital camera (Sony, made in Japan) and transferred directly to the computer. The protein bands were then analyzed with the Total Lab program to determine the molecular mass of each band and to compare the presence and absence of bands among the studied cultivars. These data were imported into the Multi-Variate Statistical Package (MVSP) to compute the similarity matrix and dendrogram (UPGMA, using Jaccard's coefficient), which reflect the relationships among the studied cultivars.
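The similarity and clustering step can be sketched in plain Python as follows. This is only an illustration of Jaccard's coefficient and average-linkage (UPGMA) clustering, not a reproduction of MVSP, whose internals are not described here; the profile names A to D and their band scores are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two 0/1 band profiles: shared bands / bands in either."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return shared / either if either else 1.0


def upgma(profiles):
    """Average-linkage (UPGMA) clustering; returns the tree as nested tuples."""
    # Each cluster is keyed by its (sub)tree and stores its member names.
    clusters = {name: [name] for name in profiles}
    dist = {
        frozenset(pair): 1.0 - jaccard(profiles[pair[0]], profiles[pair[1]])
        for pair in combinations(profiles, 2)
    }

    def avg_dist(c1, c2):
        # UPGMA distance: mean of all pairwise distances between members.
        pairs = [(m, n) for m in clusters[c1] for n in clusters[c2]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two clusters with the smallest average distance.
        c1, c2 = min(combinations(clusters, 2), key=lambda p: avg_dist(*p))
        clusters[(c1, c2)] = clusters.pop(c1) + clusters.pop(c2)
    return next(iter(clusters))


# Hypothetical band profiles (1 = band present, 0 = band absent).
profiles = {
    "A": [1, 1, 1, 0, 0, 1],
    "B": [1, 1, 1, 0, 1, 1],
    "C": [0, 1, 0, 1, 1, 0],
    "D": [0, 1, 0, 1, 1, 1],
}

tree = upgma(profiles)
print(tree)
```

Jaccard's coefficient ignores bands that are absent from both profiles, so shared absences do not inflate the similarity between two cultivars.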

Results

Phylogenetic relationships of soybean cultivars as revealed by SCoT marker

The amplification results of the SCoT primers used in this investigation are presented in Table 1. The 11 primers produced clear, reproducible, and scorable patterns, and the amplification profiles were screened for polymorphisms among the studied soybean cultivars (Fig. 1). The SCoT fingerprints revealed characteristic banding patterns for each cultivar.

Fig. 1

SCoT products on 1.2% agarose gel using different primers: SCoT 2, SCoT 4, SCoT 5, SCoT 8, SCoT 13, SCoT 22, SCoT 24, SCoT 26, SCoT 27, SCoT 34, and SCoT 36. Lane M: 1 kb ladder DNA marker. Lanes 1, 2, 3, 4, 5, and 6 represent cultivars Giza111, Giza21, Giza82, Giza35, Giza22, and Giza83, respectively

Soybean cultivars Giza82 and Giza22 revealed positive and negative unique bands, respectively, with SCoT-5 (Table 3), and these bands could be used to distinguish the studied cultivars. A total of 106 bands were generated, of which 52 were polymorphic. The percentage of polymorphic bands ranged from 28.57% with SCoT-22 to 66.66% with SCoT-4 and SCoT-5, with a total average polymorphism of 49.11% (Table 2).
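As a worked check of the arithmetic, the sketch below computes the pooled percentage from the reported totals (52 polymorphic bands out of 106). The pooled figure is about 49.06%, slightly different from the reported average of 49.11%, which would be consistent with averaging the per-primer percentages instead; that interpretation is an assumption. The per-primer counts in the second part are hypothetical and only show how the two summaries can differ.

```python
from statistics import mean


def percent_polymorphism(polymorphic, total):
    """Percentage of polymorphic bands among all scored bands."""
    return 100.0 * polymorphic / total


# Pooled over all primers, using the totals reported in the text.
pooled_reported = percent_polymorphism(52, 106)
print(round(pooled_reported, 2))

# Hypothetical per-primer counts as (polymorphic, total) pairs.
hypothetical_primers = [(2, 7), (6, 9)]
mean_of_percentages = mean(percent_polymorphism(p, t) for p, t in hypothetical_primers)
pooled_example = percent_polymorphism(
    sum(p for p, _ in hypothetical_primers),
    sum(t for _, t in hypothetical_primers),
)
print(round(mean_of_percentages, 2), pooled_example)
```

The two summaries coincide only when every primer yields the same number of bands, so it helps to state which one is being reported.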

Table 2 Total number of amplicons, monomorphic and polymorphic fragments, and percentage of polymorphism as revealed by SCoT markers between the examined soybean cultivars

The polymorphism information content (PIC) of the SCoT markers ranged from 0.2 for SCoT-22 to 0.74 for SCoT-5, with an average of 0.44. The amplification profiles (Fig. 1) of the three primers SCoT-5, SCoT-2, and SCoT-4 were highly informative based on the calculated PIC (Table 2). These primers, with their higher PIC values, have more potential for further study, allowing more cultivars or sampling sites to be investigated with a reduced number of primers.

The highest similarity value (0.83) was recorded between the two cultivars Giza82 and Giza35, followed by 0.806 between Giza82 and Giza83; these values indicate that the cultivars in each pair are closely related. The lowest value (0.646) was recorded between Giza82 and Giza21, which indicates that these two cultivars are genetically distant (Table 4). The dendrogram gave two main clusters; the first cluster includes the cultivars Giza111 and Giza21, while the second cluster includes the cultivars Giza82, Giza35, Giza22, and Giza83. The second cluster was divided into two sub-clusters; the first sub-cluster includes the cultivars Giza82 and Giza35, while the second sub-cluster includes the cultivars Giza22 and Giza83 (Fig. 2).

Fig. 2

Dendrogram for the six examined soybean cultivars constructed from the SCoT polymorphism data using the unweighted pair group method with arithmetic mean (UPGMA) and similarity matrices computed according to Jaccard's coefficient

Seed storage proteins analysis of soybean cultivars as revealed by using SDS-PAGE

SDS-PAGE of seed storage proteins (soluble proteins) was used to investigate the biochemical differences among the tested soybean cultivars (Table 5 and Fig. 3). The banding patterns indicate differences among the tested cultivars in the molecular weight and optical intensity of the bands. The results of SDS-PAGE revealed a total of 23 bands with molecular weight (Mw) ranging from 20 to 270 kilodaltons (kDa). The maximum number of bands (22) was recorded in cultivars Giza111 and Giza82, while the minimum number of bands (18) was recorded in cultivar Giza35. Only one negative unique band was detected, in the cultivar Giza35 (33 kDa). The band at Mw 150 kDa was observed only in the cultivars Giza111 and Giza82, while the band at Mw 120 kDa was absent in cultivars Giza22 and Giza35.
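The notion of unique bands can be made concrete with a short Python sketch. It assumes the usual meaning of the terms: a positive unique band is present in exactly one cultivar, and a negative unique band is absent in exactly one cultivar. The three rows below encode only the band patterns stated in the text for the 33, 120, and 150 kDa bands; the function name and data layout are illustrative.

```python
cultivars = ["Giza111", "Giza21", "Giza82", "Giza35", "Giza22", "Giza83"]

# 1 = band present, 0 = band absent, in the cultivar order given above.
bands = {
    "33 kDa": [1, 1, 1, 0, 1, 1],   # absent only in Giza35
    "120 kDa": [1, 1, 1, 0, 0, 1],  # absent in Giza35 and Giza22
    "150 kDa": [1, 0, 1, 0, 0, 0],  # present only in Giza111 and Giza82
}


def unique_bands(bands, cultivars):
    """Return (positive, negative) lists of (band label, cultivar) pairs."""
    positive, negative = [], []
    for label, row in bands.items():
        present = [c for c, s in zip(cultivars, row) if s == 1]
        absent = [c for c, s in zip(cultivars, row) if s == 0]
        if len(present) == 1:  # band seen in a single cultivar
            positive.append((label, present[0]))
        if len(absent) == 1:   # band missing from a single cultivar
            negative.append((label, absent[0]))
    return positive, negative


positive, negative = unique_bands(bands, cultivars)
print(positive, negative)
```

Of the three bands, only the 33 kDa band qualifies as unique, because the other two are missing from, or present in, more than one cultivar.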

Fig. 3

SDS-PAGE analysis of protein patterns of six Egyptian soybean cultivars. M: Protein marker. 1: Giza111. 2: Giza21. 3: Giza82. 4: Giza35. 5: Giza22. 6: Giza83

The seed storage protein profiles were used to determine the relationships between the studied cultivars. The similarity indices among these cultivars were estimated for each pair-wise group (Table 6 and Fig. 4). The similarity indices based on protein analysis ranged from 0.714 to 0.909, with 0.905 also among the highest values (Table 6). The highest similarity index (0.909) was recorded between Giza111 and Giza82, while the lowest similarity index (0.714) was recorded between Giza35 and Giza83. The dendrogram gave two main genetic clusters; the first cluster includes the cultivars Giza22 and Giza35, while the second cluster includes the cultivars Giza83, Giza21, Giza82, and Giza111. The second cluster was further divided into two sub-clusters; the first sub-cluster includes the cultivars Giza83 and Giza21, while the second sub-cluster includes the cultivars Giza82 and Giza111 (Fig. 4).

Fig. 4

Dendrogram of the six examined soybean cultivars constructed from the protein patterns using the unweighted pair group method with arithmetic mean (UPGMA) and similarity matrices computed according to Jaccard's coefficient

Genetic similarity analysis based on the collective data of SCoT marker and SDS-PAGE

The highest similarity value (0.827) was recorded between the two cultivars Giza82 and Giza35, followed by 0.813 between Giza22 and Giza83; this indicates that the cultivars in each pair are closely related. The lowest value (0.681) was recorded between the two cultivars Giza82 and Giza21, which indicates that these two cultivars are genetically distant (Table 7). The dendrogram gave two main genetic clusters; the first cluster comprises the cultivars Giza111 and Giza21, while the second cluster includes the cultivars Giza82, Giza35, Giza22, and Giza83. The second cluster was divided into two sub-clusters; the first sub-cluster includes the cultivars Giza82 and Giza35, while the second sub-cluster includes the cultivars Giza22 and Giza83 (Fig. 5). The UPGMA results of the collective data are similar to the UPGMA results of the SCoT marker; this indicates that the SCoT marker is a powerful marker for the discrimination and identification of different cultivars.
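The agreement between dendrograms can be checked by comparing their clades. The sketch below encodes the three cluster structures exactly as described in the text as nested tuples and counts the clades each marker system shares with the combined tree. Branch lengths are ignored, and the helper name `clades` is hypothetical.

```python
def clades(tree):
    """Collect every clade (set of cultivar names) of a nested-tuple tree."""
    found = set()

    def walk(node):
        if isinstance(node, str):  # a leaf is a single cultivar name
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found


# Cluster structures as described in the Results (topology only).
scot = (("Giza111", "Giza21"), (("Giza82", "Giza35"), ("Giza22", "Giza83")))
protein = (("Giza22", "Giza35"), (("Giza83", "Giza21"), ("Giza82", "Giza111")))
combined = (("Giza111", "Giza21"), (("Giza82", "Giza35"), ("Giza22", "Giza83")))

shared_scot = clades(scot) & clades(combined)
shared_protein = clades(protein) & clades(combined)
print(len(shared_scot), len(shared_protein))
```

The SCoT tree shares all five of its clades with the combined tree, whereas the protein tree shares only the root clade containing all six cultivars.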

Fig. 5

Dendrogram of the six examined soybean cultivars constructed from the SCoT polymorphism and protein pattern data using the unweighted pair group method with arithmetic mean (UPGMA) and similarity matrices computed according to Jaccard's coefficient

Discussion

Phylogenetic relationships of soybean cultivars as revealed by SCoT marker

Molecular markers are used to characterize phylogenetic relationships at the genomic DNA level (Madhumati, 2014). SCoT markers are very easy to design because they are based on the conserved sequence surrounding the ATG translation start codon (Xiong et al., 2011). SCoT markers have already been evaluated in Elymus sibiricus (Zhang et al., 2015) and Vigna unguiculata (Igwe et al., 2017). In the present study, no relationship was found between the GC content of a primer and the clarity of its banding profile, which is in harmony with Marilla and Scoles (1996). The percentage of polymorphic bands ranged from 28.57% with SCoT-22 to 66.66% with SCoT-4 and SCoT-5, as shown in Fig. 1 and Tables 2 and 3. Satya et al. (2013) obtained similar results when studying genetic diversity among 155 genotypes of Boehmeria nivea L.; they used 24 SCoT markers, which produced 136 amplicons with 87.5% polymorphism. The high polymorphism (49.11%) reported in this study (Table 2) is in line with earlier investigations in soybean: Boerma and Mian (1998) reported 98.54% polymorphism with RAPD markers, Fahmy and Salama (2002) observed 94% genetic polymorphism with RAPD markers, and Barakat (2004) reported 90.84% polymorphism in soybean cultivars using RAPD markers. Our study is also in harmony with investigations in other species, such as grape, where the polymorphism was 93% (Guo et al., 2012), and Atriplex halimus, where it was 97.10% (El Framawy et al., 2016). Polymorphic information content (PIC) ranged from 0.2 for SCoT-22 to 0.74 for SCoT-5 with an average of 0.44, which is in agreement with PIC values obtained in other studies, such as 0.72 with SSR markers (Ghose et al., 2013) and 0.38 with RAPD markers (Henuka et al., 2015). PIC values are therefore used to assess the usefulness of markers for fingerprinting (Agarwal et al., 2018).

Table 3 Number of positive and negative unique SCoT markers recorded in soybean cultivars

The highest similarity value (0.83) was recorded between the two cultivars Giza82 and Giza35, while the lowest value (0.646) was recorded between cultivars Giza82 and Giza21 (Table 4 and Fig. 2). Xiong et al. (2011) determined the phylogenetic relationships of cultivated peanut genotypes using the SCoT marker and documented that not all genotypes belonging to the same variety were classified in the same group. Aboulila and Mansour (2017) studied the phylogenetic relationships among ten Hordeum vulgare genotypes using the SCoT marker and reported that it is an efficient technique for obtaining new fingerprints of Hordeum vulgare. Furthermore, Mohamed et al. (2017) used the SCoT marker for the discrimination and identification of 14 wheat (Triticum aestivum L.) cultivars obtained from different locations in North Africa and classified these cultivars into many groups according to the resulting dendrogram.

Table 4 Similarity matrix among the studied soybean cultivars, computed according to Jaccard's coefficient, as revealed by SCoT markers

Seed storage proteins analysis of soybean cultivars as revealed using SDS-PAGE

These protein profiles were used as a biochemical fingerprint for the six tested Egyptian soybean cultivars (Table 5 and Fig. 3). The low polymorphism of the proteins could be attributed to the conservative nature of seed proteins (Bonfitto et al., 1999). A low level of protein polymorphism was also found in early ripening peach of Sinai (Mansour et al., 1998) and in mung bean cultivars (Hassan, 2001). Soybean seeds contain two types of proteins, trypsin inhibitor and lipoxygenase, that are responsible for antibiosis in susceptible insects (Johnston et al., 1993). Trypsin inhibitor protein has a Mw of 20 kDa (Krishnan, 2001). In this investigation, the protein band at Mw 20 kDa may be a trypsin inhibitor, while the band at Mw 89 kDa (Fig. 3) may be lipoxygenase (Stejskal and Griga, 1995).

Table 5 SDS-PAGE analysis of protein patterns of soybean cultivars: lane 1: Giza111, lane 2: Giza21, lane 3: Giza82, lane 4: Giza35, lane 5: Giza22, and lane 6: Giza83

The highest similarity index (0.909) was found between Giza111 and Giza82, while the lowest similarity index (0.714) was found between Giza35 and Giza83 (Table 6 and Fig. 4). Our results were compared with those of Sofalian et al. (2015), who studied genetic variation among 17 soybean genotypes using seed storage proteins as a biochemical marker. They found that the genotypes were divided into four groups, whereas proteins with low Mw (LMW-GS) and high Mw (HMW-GS) divided them into three groups.

Table 6 Similarity matrix among the studied soybean cultivars, computed according to Jaccard's coefficient, as revealed by protein markers

Genetic similarity analysis based on the collective data of SCoT marker and SDS-PAGE

Our genetic similarity analysis based on the collective data of the SCoT marker and SDS-PAGE (Fig. 5 and Table 7) was in agreement with the results of Barakat (2004), who used RAPD and SDS-PAGE techniques to study the phylogenetic relationships of six Egyptian soybean cultivars.

Table 7 Similarity matrix among the studied soybean cultivars, computed according to Jaccard's coefficient, as revealed by SCoT and protein markers

Conclusion

Genetic similarity based on the collective data of the SCoT marker and SDS-PAGE was highest between Giza82 and Giza35 and lowest between Giza82 and Giza21. The genetic similarity results from the collective data of both techniques indicate that SCoT markers are more viable than SDS-PAGE for the discrimination and identification of cultivars.