Introduction

Poaceae is a large family that includes the tribe Triticeae, which comprises around 400–500 species. The genus Triticum L. includes species at three ploidy levels: diploid (2n = 2x = 14), tetraploid (2n = 4x = 28), and hexaploid (2n = 6x = 42). Many of these species are economically important food crops (Doebley et al. 2006). The genomes that contribute to the genome constitution of Triticum species are designated A, B, D, and G. Several types of analysis have provided critical knowledge about the ancestry of the individual genomes in allopolyploid species (Zhang et al. 2002; Gu et al. 2004). It is generally accepted that diploid wheat and Aegilops squarrosa L. (syn. Ae. tauschii; goat grass) are the donors of the A and D genomes, respectively (McFadden and Sears 1946). Many different species have been reported as the original donor of the B and G genomes, but it is now largely believed that the progenitor was a member of section Sitopsis of the genus Aegilops, namely Ae. bicornis, Ae. longissima, Ae. searsii or, most likely, Ae. speltoides (Provan et al. 2004). Ae. speltoides has also been considered the maternal donor of Triticum species (Dizkirici et al. 2016). Another hypothesis is that the B genome of polyploid wheat is of polyphyletic origin, i.e., it is a recombined genome derived from two or more diploid species.

DNA sequence analysis is a modern approach to studying evolutionary relationships and biodiversity (Stoeckle 2003; Ferri et al. 2009). DNA barcoding plays an essential role in species identification (Hebert et al. 2003) because a short DNA sequence can provide high discriminatory power between organisms. DNA barcodes are therefore valuable in biodiversity investigations for identifying plants with a problematic taxonomic identity and for identifying polymorphic plant species (Ajmal et al. 2014; Skuza et al. 2015).

Many plant DNA barcodes are available, such as rbcL, matK, trnH-psbA, and ITS (CBOL Plant Working Group 2009; China Plant BOL Group 2011; Li et al. 2015). The Plant Working Group of the Consortium for the Barcode of Life (CBOL) recommended a combination of two chloroplast barcodes (matK and rbcL) as the standard plant DNA barcode, supplemented with an additional barcode as required (CBOL Plant Working Group 2009).

The coding sequence of the chloroplast matK gene is about 1500 bp long and is translated into a maturase-like protein of around 500 amino acids. The matK gene is one of the most useful regions because it is the most rapidly evolving plastid gene and provides sufficient information to resolve phylogenetic relationships at the intergeneric level (Young and dePamphilis 2000). The matK gene has a high substitution rate compared with other genes used in grass systematics. It also shows a large proportion of variation at the nucleic acid level at the first and second codon positions, a low transition/transversion ratio, and the presence of mutationally conserved sectors. All these features make the matK gene useful for determining relationships at the family and species levels (Liang and Hilu 1996).

The aim of this research is to investigate the genetic relationships among 20 Triticum accessions collected from different countries: 3 diploid Triticum monococcum L. (einkorn wheat), 11 tetraploid accessions (1 T. dicoccon subsp. dicoccon (emmer), 2 T. turgidum subsp. dicoccoides (wild emmer), and 8 T. turgidum subsp. durum (Desf.) (durum or macaroni wheat)), and 6 hexaploid T. aestivum (common wheat). The analysis uses one DNA barcode, the matK gene, together with its translated amino acid sequence (151 amino acids), which forms part of the maturase K-like protein.

Materials and methods

Plant materials

Twenty Triticum accessions were used: diploid (Triticum monococcum L., AmAm), tetraploid (Triticum turgidum subsp. dicoccoides, Triticum dicoccon subsp. dicoccon, and Triticum turgidum subsp. durum (Desf.), BBAuAu), and hexaploid (Triticum aestivum, BBAuAuDD). They were obtained from the International Center for Agricultural Research in the Dry Areas (ICARDA, Aleppo, Syria), the Leibniz Institute of Plant Genetics and Crop Plant Research (IPK, Gatersleben, Germany), the Agricultural Research Center (ARC, Giza, Egypt), and the Egyptian National Gene Bank (Agricultural Research Center, Giza, Egypt), as listed in Table 1.

Table 1 Scientific and common names of the 20 Triticum accessions from different countries, with their code names and GenBank accession numbers

Genomic DNA isolation

Genomic DNA was isolated from 100 mg of young leaf tissue using the GeneJET Plant Genomic DNA Purification Mini Kit (Thermo Scientific, K0791). The extracted DNA was assessed by agarose gel electrophoresis and spectrophotometry (NanoDrop 2000; Thermo Scientific), diluted to 50 ng/μl, and then used as a template for PCR (Golovnina et al. 2007).

MatK primer design

Seven partial matK gene sequences and one complete matK gene sequence of different Triticum species were retrieved from the National Center for Biotechnology Information (NCBI) database (GenBank). The sequences used have accession numbers DQ420054.1 (T. monococcum, partial sequence), KC608185.1 (T. monococcum subsp. aegilopoides, partial sequence), KC608186.1 (T. monococcum subsp. aegilopoides, partial sequence), KC608208.1 (T. turgidum subsp. dicoccon, partial sequence), KC608210.1 (T. turgidum subsp. dicoccon, partial sequence), DQ420019.1 (T. aestivum, partial sequence), DQ420050.1 (T. aestivum, partial sequence), and AF164405.1 (T. aestivum, complete sequence). The downloaded sequences were saved as FASTA files and aligned with MEGA version 6. The region from base 123 to base 644, about 521 bp long, was present in all aligned sequences, and this region was used to design the matK primer pair with the online program Primer3 (version 0.4.0) (http://bioinfo.ut.ee/primer3-0.4.0). The matK primers, forward 5′-ACCTGTGGAAATAGTTGTTAGTTGT-3′ and reverse 5′-CCAATTCGAATAGTAGTTGAGAAAG-3′, were designed to amplify a 454 bp fragment of the matK sequence. The designed primers were then tested in silico by aligning them with the retrieved complete Triticum matK gene sequence to confirm that they were specific to matK and matched the Triticum matK gene with 100% identity.
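
The in silico primer test described above can be illustrated with a minimal Python sketch. It is not the authors' procedure (they aligned the primers against the complete matK sequence); it simply searches a template for an exact match of the forward primer and of the reverse complement of the reverse primer, then reports the amplicon coordinates. The function names and the toy template are hypothetical, and the template is not a real matK sequence.

```python
# Minimal in silico PCR sketch with exact primer matching (illustrative only).

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def in_silico_pcr(template: str, forward: str, reverse: str):
    """Return (start, end, length) of the predicted amplicon, or None.

    Coordinates are 0-based, end-exclusive. Only exact matches are accepted.
    """
    template = template.upper()
    fwd_pos = template.find(forward.upper())
    if fwd_pos == -1:
        return None
    # The reverse primer anneals to the opposite strand, so on the template
    # strand we look for its reverse complement downstream of the forward site.
    rev_site = reverse_complement(reverse)
    rev_pos = template.find(rev_site, fwd_pos + len(forward))
    if rev_pos == -1:
        return None
    end = rev_pos + len(rev_site)
    return fwd_pos, end, end - fwd_pos


# Primer sequences as given in the text.
FORWARD = "ACCTGTGGAAATAGTTGTTAGTTGT"
REVERSE = "CCAATTCGAATAGTAGTTGAGAAAG"

# Toy template (hypothetical, NOT a real matK sequence): flank + forward site
# + 40-base spacer + reverse-primer site + flank.
toy_template = "GGGG" + FORWARD + "A" * 40 + reverse_complement(REVERSE) + "CCCC"

result = in_silico_pcr(toy_template, FORWARD, REVERSE)
print(result)
```

Run against a real complete matK sequence (for example, one downloaded from GenBank), the same function would be expected to report the amplicon length that the primer pair is designed to produce.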

PCR amplification

PCR was carried out in duplicate in a T100™ Thermal Cycler (Bio-Rad) in a final volume of 25 μl. Each reaction mixture contained 5× Taq buffer, MgCl2, 0.2 mM dNTP, 10 pM of each primer, 50 ng genomic DNA, and 1 U GoTaq DNA Polymerase (Promega, USA). The thermal profile was 95 °C for 4 min, followed by 35 cycles of 95 °C for 30 s, 57 °C for 1 min, and 72 °C for 1 min, and a final extension at 72 °C for 5 min. PCR products were checked on a 1.5% agarose gel containing ethidium bromide in 1× TAE buffer (pH 8.0). The gel was analyzed and archived using the Molecular Imager® GelDoc™ XR software. Bands were scored and analyzed with the Quantity One software (Bio-Rad). Product sizes were determined by comparison with the 100 bp DNA Ladder H3 RTU (GeneDirex, cat no. DM003-R500). The sequences obtained in this study have been deposited in the GenBank nucleotide sequence database (National Center for Biotechnology Information (NCBI)) under accession numbers MN047218, MN047219, MN047220, MN047221, MN047222, MN062364, MN062365, MN062366, MN062367, MN062368, MN062369, MN062370, MN062371, MN062372, MN062373, MN062374, MN062375, MN062376, MN062377, and MN062378 (Table 1).

Phylogenetic analysis

The chromatogram data were visualized with the BioEdit program version 3 (Hall 1999). The nucleotide sequences were aligned with the Clustal W multiple sequence alignment program. The evolutionary history was inferred using the maximum likelihood method based on the Tamura 3-parameter model (Tamura 1992). Each constructed tree was evaluated with two statistical analyses: bootstrapping (1000 replications) and pairwise distance. Total nucleotide length (bp), estimates of evolutionary divergence between sequences, percentage nucleotide composition and polymorphism, the maximum likelihood substitution matrix, and the maximum likelihood estimate of transition/transversion bias were calculated with MEGA 6.06 (Tamura et al. 2013).
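
For readers unfamiliar with the Tamura 3-parameter model, the pairwise distance it defines can be sketched as follows. This is a simplified illustration, not the MEGA implementation: it uses the closed-form distance from Tamura (1992), d = -h ln(1 - P/h - Q) - (1/2)(1 - h) ln(1 - 2Q), where P and Q are the proportions of transitional and transversional differences and h = 2θ(1 - θ), and it assumes θ is the pooled G+C content of the two sequences. The function name is hypothetical.

```python
import math

PURINES = {"A", "G"}


def t92_distance(seq1: str, seq2: str) -> float:
    """Tamura (1992) 3-parameter distance between two aligned sequences.

    Sites where either sequence has a gap or ambiguous base are skipped.
    Assumption: theta is the pooled G+C content of both sequences.
    """
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    n = len(pairs)
    if n == 0:
        raise ValueError("no comparable sites")

    # A difference is a transition if both bases are purines or both are
    # pyrimidines; otherwise it is a transversion.
    transitions = sum(1 for a, b in pairs
                      if a != b and (a in PURINES) == (b in PURINES))
    transversions = sum(1 for a, b in pairs
                        if a != b and (a in PURINES) != (b in PURINES))
    p = transitions / n
    q = transversions / n

    theta = sum((a in "GC") + (b in "GC") for a, b in pairs) / (2 * n)
    h = 2 * theta * (1 - theta)
    if h == 0:
        raise ValueError("G+C content of 0 or 1: distance undefined")

    arg1 = 1 - p / h - q
    arg2 = 1 - 2 * q
    if arg1 <= 0 or arg2 <= 0:
        raise ValueError("sequences too divergent: distance undefined")
    return -h * math.log(arg1) - 0.5 * (1 - h) * math.log(arg2)


# Toy example: one transition (C -> T) in ten sites.
d = t92_distance("ACGTACGTAC", "ACGTACGTAT")
print(round(d, 4))
```

The corrected distance is larger than the raw proportion of differences (0.1 in the toy example) because the model accounts for multiple substitutions at the same site.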

Amino acid sequence

The nucleotide sequence of each studied Triticum sample was translated into an amino acid sequence with the ExPASy online program (https://web.expasy.org/translate). The amino acid sequences were aligned with the Clustal W multiple sequence alignment program to construct the phylogenetic tree. The evolutionary history was inferred using the maximum likelihood method based on the JTT matrix-based model (Jones et al. 1992), as stated in the legends of Figs. 4 and 5. Each constructed tree was evaluated with two statistical analyses: bootstrapping (1000 replications) and pairwise distance.
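
The translation step can be illustrated with a short Python sketch. It is not the ExPASy tool used in the study; it is a plain implementation of codon-by-codon translation, assuming the standard codon assignments (which the plastid genetic code shares) and reading frame 0. It also shows why a 454 bp fragment yields 151 amino acids: 454 bases contain 151 complete codons, with one base left over.

```python
from itertools import product

# Standard codon table built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(seq: str, frame: int = 0) -> str:
    """Translate a nucleotide sequence codon by codon.

    Stop codons are written as '*', codons containing ambiguous bases
    as 'X', and any incomplete trailing codon is ignored.
    """
    seq = seq.upper().replace("U", "T")
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(frame, len(seq) - 2, 3))


print(translate("ATGGCCTAA"))

# A 454-base fragment holds 151 complete codons (453 bases) plus 1 spare base.
print(len(translate("A" * 454)))
```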

Results

The selected portion of the matK gene was successfully amplified in all studied Triticum samples. The amplicons were sequenced and deposited in GenBank under accession numbers MN047218, MN047219, MN047220, MN047221, MN047222, MN062364, MN062365, MN062366, MN062367, MN062368, MN062369, MN062370, MN062371, MN062372, MN062373, MN062374, MN062375, MN062376, MN062377, and MN062378 (Fig. 1 and Table 1). The amplified partial matK fragment was about 454 bp long in all samples, with 189 monomorphic nucleotide positions and 265 polymorphic sites (58.37% polymorphism). The average GC content was around 35.3% across all tested samples.
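
As a worked check of the figures above, 265 polymorphic sites out of 454 aligned positions correspond to 265/454 = 58.37%, and 189 + 265 = 454. The sketch below shows one straightforward way such site counts and GC content can be derived from an alignment. The function names and the toy alignment are hypothetical and are not the study data; MEGA's own counting rules (for example, its treatment of gaps) may differ.

```python
def site_summary(alignment):
    """Count monomorphic and polymorphic columns in an aligned sequence set."""
    lengths = {len(seq) for seq in alignment}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_sites = lengths.pop()
    # A column is polymorphic if it contains more than one distinct character.
    polymorphic = sum(1 for column in zip(*alignment) if len(set(column)) > 1)
    return {
        "sites": n_sites,
        "monomorphic": n_sites - polymorphic,
        "polymorphic": polymorphic,
        "percent_polymorphic": round(100 * polymorphic / n_sites, 2),
    }


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a sequence."""
    seq = seq.upper()
    return round(100 * (seq.count("G") + seq.count("C")) / len(seq), 2)


# Toy alignment (hypothetical): columns 2 and 4 vary, the other four do not.
toy_alignment = ["ATGCAT", "ATGTAT", "ACGCAT"]
print(site_summary(toy_alignment))
print(gc_percent(toy_alignment[0]))

# Arithmetic check of the values reported in the text.
print(round(100 * 265 / 454, 2))
```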

Fig. 1 PCR-amplified matK fragments from 20 different Triticum species

The nucleotide sequences of all studied species were analyzed under the Tamura (1992) model to estimate the rates of the different transitional and transversional substitutions, as shown in Table 2. Base substitution is the basis of single-nucleotide polymorphism (SNP) and involves either a transition (an exchange of a pyrimidine for a pyrimidine or of a purine for a purine) or a transversion (an exchange of a pyrimidine for a purine or vice versa). The estimated transition/transversion bias (R) is 0.99. Substitution patterns and rates were estimated, and the nucleotide frequencies are A = 32.34%, T/U = 32.34%, C = 17.66%, and G = 17.66%.
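
The equal frequencies of A and T and of C and G reported above follow from the structure of the Tamura 3-parameter model, which describes base composition with a single G+C content parameter θ: the model frequencies are G = C = θ/2 and A = T = (1 - θ)/2. The short sketch below reproduces the reported values from θ = 0.3532 (the sum of the reported C and G frequencies, consistent with the average GC content of about 35.3%). The function name is hypothetical.

```python
def t92_base_frequencies(gc_content: float) -> dict:
    """Base frequencies implied by the Tamura 3-parameter model.

    The model uses one composition parameter (G+C content), so
    freq(G) = freq(C) = theta / 2 and freq(A) = freq(T) = (1 - theta) / 2.
    """
    if not 0.0 <= gc_content <= 1.0:
        raise ValueError("gc_content must be between 0 and 1")
    at_each = (1.0 - gc_content) / 2.0
    gc_each = gc_content / 2.0
    return {"A": at_each, "T": at_each, "C": gc_each, "G": gc_each}


# theta taken as the sum of the reported C and G frequencies (17.66% each).
freqs = t92_base_frequencies(0.3532)
for base, value in freqs.items():
    print(base, round(100 * value, 2))
```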

Table 2 Maximum likelihood estimate of substitution matrix

Molecular phylogenetic analysis based on DNA sequence of partial matK gene

The partial chloroplast matK gene sequences were used to assess the phylogenetic relationships among the studied Triticum species. The evolutionary history was inferred using the maximum likelihood method based on the Tamura 3-parameter model, with two statistical analyses: bootstrapping and pairwise distance. Both analyses gave the same phylogenetic tree (Figs. 2 and 3). The tree divided the 20 studied Triticum samples into two groups, A and B. Group A (green) comprised the diploid species (2n = 2x = 14; T. monococcum L., AmAm; einkorn) collected from three countries: Iraq (IG 109083), Iran (IG 113259), and Syria (IG 44936). This group was split into two subgroups: the first contained T. monococcum L. from Iran and Syria, and the second contained T. monococcum L. from Iraq only. Group B was split into two subgroups, I and II. Subgroup I (red) comprised the hexaploid species (2n = 6x = 42; T. aestivum, BBAuAuDD), and subgroup II (blue) comprised the tetraploid species (2n = 4x = 28; T. turgidum subsp. dicoccoides (wild emmer), T. dicoccon subsp. dicoccon (emmer), and T. turgidum subsp. durum (macaroni wheat), all BBAuAu).

Within subgroup I (red), T. aestivum from India (TRI 28936) and Libya (TRI 13955) were closely related to each other. The T. aestivum accessions Sids 4 (Egyptian cultivar) and Qena, Nag Hamad 27 (Egyptian landrace) differed from each other and from the Indian and Libyan accessions. Likewise, the T. aestivum accessions Giza 168 (Egyptian cultivar) and New Valley, Dakhla 7 (Egyptian landrace) differed from each other and from all other T. aestivum accessions.

Subgroup II (blue) was divided into two clusters. The first cluster was split into two subclusters. The first subcluster contained T. turgidum subsp. dicoccoides (wild emmer) from Syria, code numbers IG 46467 and IG 46447, indicating that these two accessions were closely related to each other. The second subcluster contained only T. dicoccon subsp. dicoccon (emmer) accession TRI 28920, whose origin is recorded only as "Ursprungsland" (German for "country of origin"). The second cluster contained T. turgidum subsp. durum (Desf.) and was divided into two subclusters. The first subcluster contained T. turgidum subsp. durum from Turkey (TRI 28834) and Iran (TRI 19242), which were closely related to each other. The second subcluster was divided into two sections. The first section contained T. turgidum subsp. durum from Italy (TRI 27360 and 27284), which were closely related to each other. The second section was divided into two subsections: the first contained T. turgidum subsp. durum from Egypt (TRI 19223) and the Egyptian cultivar Sohag 4, and the second contained the T. turgidum subsp. durum Egyptian landraces Sohag, Almonshaah 34 and Sohag, Almonshaah 41.

Fig. 2 Molecular phylogenetic analysis of different Triticum species by bootstrapping analysis based on the nucleotide sequence of the partial matK gene. The phylogenetic analysis was performed in MEGA version 6 by the maximum likelihood method based on the Tamura 3-parameter model. The bootstrap consensus tree inferred from 1000 replicates is taken to represent the evolutionary history of the species analyzed. Branches corresponding to partitions reproduced in less than 50% of bootstrap replicates are collapsed. The percentage of replicate trees in which the associated species clustered together in the bootstrap test (1000 replicates) is shown next to the branches (Felsenstein 1985)

Fig. 3 Molecular phylogenetic analysis of different Triticum species by pairwise distance analysis based on the nucleotide sequence of the partial matK gene. The phylogenetic analysis was performed in MEGA version 6 by the maximum likelihood method based on the Tamura 3-parameter model. The tree with the highest log likelihood (−3290.1725) is shown. The percentage of trees in which the associated species clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying the Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the maximum composite likelihood (MCL) approach and then selecting the topology with the superior log likelihood value. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site

Estimates of evolutionary divergence between sequences

The number of base substitutions per site was estimated between the nucleotide sequences (454 bp) of all studied Triticum species, as shown in Table 3. All ambiguous positions were removed for each sequence pair. Analyses were performed using the Tamura 3-parameter model in MEGA version 6. Table 3 and Figs. 2 and 3 show that the highest evolutionary divergence, 0.37, was found between T. monococcum L. from Syria (IG 44936) and T. turgidum subsp. dicoccoides from Syria (IG 46447), indicating that these two species were highly different. The lowest evolutionary divergence, 0.02, was found between T. turgidum subsp. durum from Turkey (TRI 28834) and Iran (TRI 19242), and also between the T. turgidum subsp. durum Egyptian landraces Sohag, Almonshaah 34 and Sohag, Almonshaah 41, indicating high similarity within each pair. The evolutionary divergence between the Egyptian Triticum aestivum cultivars (Giza 168 and Sids 4) was 0.12, while that between the Egyptian Triticum aestivum landraces (New Valley, Dakhla 7 and Qena, Nag Hamad 27) was 0.08. This indicated that the differences within both the Egyptian cultivars and the Egyptian landraces were relatively low.

Table 3 Estimates of evolutionary divergence between sequences of 20 different Triticum species

Estimates of base composition bias difference between sequences

From the analysis of all nucleotide sequences, the difference in base composition bias per site was computed and recorded in Table 4 (Kumar and Gadagkar 2001). Even when the substitution patterns are homogeneous among lineages, the compositional distance will correlate with the number of differences between sequences. Table 4 and Figs. 2 and 3 show that the highest compositional distance, 0.83, was found between T. monococcum L. (Iran, IG 113259) and T. turgidum subsp. dicoccoides (Syria, IG 46447). In contrast, T. turgidum subsp. durum from Italy (TRI 127360 and TRI 127284) and the T. turgidum subsp. durum Egyptian landraces (Sohag, Almonshaah 34 and Sohag, Almonshaah 41) showed no compositional distance. The compositional distances between the Egyptian Triticum aestivum cultivar Sids 4 and the two Egyptian Triticum aestivum landraces (New Valley, Dakhla 7 and Qena, Nag Hamad 27) were 0.07 and 0.09, respectively, indicating that the compositional distance between these two landraces and cultivar Sids 4 was very low.

Table 4 Estimates of base composition bias difference between sequences

Molecular phylogenetic analysis based on amino acid sequence from partial matK gene translation

The translated amino acid sequences were used to assess the phylogenetic relationships among all studied species. The amino acid sequences of all studied species contain 20 types of amino acid: alanine, cysteine, aspartic acid, glutamic acid, phenylalanine, glycine, histidine, isoleucine, lysine, leucine, methionine, asparagine, proline, glutamine, arginine, serine, threonine, valine, tryptophan, and tyrosine, with average percentage frequencies of 1.06, 1.49, 4.77, 6.03, 8.44, 1.42, 4.11, 4.11, 4.44, 14.14, 1.26, 5.79, 6.62, 5.03, 4.74, 8.31, 0.73, 6.75, 1.16, and 6.19, respectively, as shown in Table 5.
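
A table of per-sample amino acid frequencies and their averages, of the kind summarized above, can be computed along the following lines. This is a minimal sketch with hypothetical function names and toy peptides, not the study data or the MEGA routine; it averages the per-sample percentages and ignores stop symbols.

```python
from collections import Counter


def amino_acid_percentages(protein: str) -> dict:
    """Percentage frequency of each amino acid in one protein sequence."""
    # Keep letters only, so stop symbols ('*') and gaps ('-') are ignored.
    residues = [aa for aa in protein.upper() if aa.isalpha()]
    if not residues:
        raise ValueError("empty protein sequence")
    counts = Counter(residues)
    return {aa: round(100 * n / len(residues), 2)
            for aa, n in sorted(counts.items())}


def average_percentages(proteins) -> dict:
    """Average the per-sample percentages across several sequences."""
    per_sample = [amino_acid_percentages(p) for p in proteins]
    residues = sorted({aa for comp in per_sample for aa in comp})
    return {aa: round(sum(comp.get(aa, 0.0) for comp in per_sample)
                      / len(per_sample), 2)
            for aa in residues}


# Toy peptides (hypothetical, not the translated matK sequences).
toy_proteins = ["AAKL", "AKKL"]
print(amino_acid_percentages(toy_proteins[0]))
print(average_percentages(toy_proteins))
```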

Table 5 Amino acid types and their percentage frequencies in the amino acid sequence of each sample, with the average over all 20 Triticum species

The evolutionary history was inferred using the maximum likelihood method based on the JTT matrix-based model (Jones et al. 1992; see the legends of Figs. 4 and 5), with two statistical analyses: bootstrapping and pairwise distance. The two analyses gave different phylogenetic trees (Figs. 4 and 5).

The tree obtained by bootstrapping analysis matched the tree based on the nucleotide sequences, except for the following differences within group B, which was divided into two subgroups, I and II. Subgroup I (red) was split into two clusters. The first cluster was divided into two subclusters: the first contained the T. aestivum accessions Sids 4 (Egyptian cultivar) and New Valley, Dakhla 7 (Egyptian landrace), and the second contained only the T. aestivum Egyptian cultivar Giza 168. The second cluster was also divided into two subclusters: the first contained T. aestivum from India (TRI 28936) and Libya (TRI 13955), indicating that these two accessions were closely related to each other, and the second contained only the T. aestivum Egyptian landrace Qena, Nag Hamad 27. Subgroup II (blue) was divided into two clusters. The first cluster was split into two subclusters: the first contained T. turgidum subsp. dicoccoides (wild emmer) from Syria, code numbers IG 46467 and IG 46447, indicating that these two accessions were closely related to each other, and the second contained only T. dicoccon subsp. dicoccon (emmer) accession TRI 28920, whose origin is recorded only as "Ursprungsland" (German for "country of origin"). The second cluster contained T. turgidum subsp. durum (Desf.) and was split into two subclusters. The first subcluster was split into two sections: the first contained T. turgidum subsp. durum from Egypt (TRI 19223) and the Egyptian cultivar Sohag 4, and the second contained the T. turgidum subsp. durum Egyptian landraces Sohag, Almonshaah 34 and Sohag, Almonshaah 41. The second subcluster was likewise split into two sections: the first contained two T. turgidum subsp. durum accessions from Italy (TRI 27360 and 27284), which were closely related to each other, and the second contained T. turgidum subsp. durum from Turkey (TRI 28834) and Iran (TRI 19242), which were also closely related to each other.

The tree obtained by pairwise distance analysis matched the bootstrapping tree, except for the following differences in the second cluster of subgroup II, which contained T. turgidum subsp. durum (Desf.). This cluster was divided into two subclusters. The first subcluster contained T. turgidum subsp. durum from Turkey (TRI 28834) and Iran (TRI 19242), which were closely related to each other. The second subcluster consisted of two sections. The first section contained T. turgidum subsp. durum from Italy (TRI 27360 and 27284), which were closely related to each other. The second section was divided into two subsections: the first contained T. turgidum subsp. durum from Egypt (TRI 19223) and the Egyptian cultivar Sohag 4, and the second contained the T. turgidum subsp. durum Egyptian landraces Sohag, Almonshaah 34 and Sohag, Almonshaah 41.

Fig. 4 Molecular phylogenetic analysis of different Triticum species by bootstrapping analysis based on the amino acid sequence of the maturase-like protein. The phylogenetic analysis was performed in MEGA version 6 by the maximum likelihood method based on the JTT matrix-based model (Jones et al. 1992). The bootstrap consensus tree inferred from 1000 replicates is taken to represent the evolutionary history of the species analyzed. Branches corresponding to partitions reproduced in less than 50% of bootstrap replicates are collapsed (Felsenstein 1985)

Fig. 5 Molecular phylogenetic analysis of different Triticum species by pairwise distance analysis based on the amino acid sequence of the maturase-like protein. The phylogenetic analysis was performed in MEGA version 6 by the maximum likelihood method based on the JTT matrix-based model (Jones et al. 1992). The tree with the highest log likelihood (−2539.1376) is shown. The percentage of trees in which the associated species clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying the Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using a JTT model and then selecting the topology with the superior log likelihood value. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site

Discussion

DNA barcoding is considered a new method for discriminating between species. According to the CBOL Plant Working Group, a DNA barcode must amplify and sequence with high efficiency, and it must show genetic variation that distinguishes sequences at the species level while remaining conserved among individuals of the same species (Hebert et al. 2003; Cowan et al. 2006; CBOL Plant Working Group 2009). The matK barcode has high substitution rates within species and is considered an important candidate for documenting plant systematics and evolution (Notredame et al. 2000). Savolainen et al. (2000) documented that the genetic relationships revealed by matK data are more robust than those obtained from combined rbcL and atpB sequences. Many studies have indicated that the chloroplast matK barcode is an essential marker for discriminating species or taxa (Newmaster and Ragupathy 2009; DeMattia et al. 2011). This gene is also used to resolve intergeneric or interspecific relationships among flowering plants, such as Malpighiaceae, Poaceae, Nicotiana, and Orchidaceae (Liang and Hilu 1996; Cameron et al. 2001; Salazar et al. 2003). The Plant Working Group (PWG) of the Consortium for the Barcoding of Life (CBOL) recommended that two gene regions, rbcL and matK, be adopted as the standard plant DNA barcode, with the nuclear ITS region as a supplementary barcode (CBOL Plant Working Group 2009). Dizkirici et al. (2016) investigated the phylogenetic relationships between different Triticum and Aegilops species using nuclear ITS and chloroplast matK genes. They found that the relationships between the polyploid wheats and Ae. speltoides obtained from the chloroplast matK and nuclear ITS sequences were the same, which supported the idea of co-inheritance of the nuclear and chloroplast genomes, with Ae. speltoides as the maternal donor.

Our results showed that the amplified and sequenced partial region of the matK gene was highly polymorphic among all studied species (58.37%). This is broadly in line with Skuza et al. (2019), who observed high nucleotide sequence variability within the matK and rbcL regions: sequence polymorphism was 2.2% in the rbcL region and 6.5% in the matK region. In their study, the most variable region, the trnH-psbA intergenic region (15.6%), was the most useful for rye barcoding, so they concluded that different DNA barcodes should be used. Our result indicates that the matK region is suitable for differentiating and discriminating between the studied species.

Awad et al. (2017) performed DNA barcoding with the matK and rbcL barcodes to discriminate 18 different Egyptian Triticum accessions. They amplified the matK gene with a universal matK primer from the published literature. This primer gave 100% PCR amplification for the 18 samples, but DNA sequencing succeeded for only 6 of the 18 matK fragments. Their analysis also demonstrated a limited ability of the matK gene to discriminate between six Egyptian Triticum accessions (Sinai-AlGora-AlArish (114), Sinai-AlGora-AlArish (113), Northern coast-Raas ElHekma (117), Northern coast-Matroh (115), Bani Sweif 1, and Seds12). Their results showed the importance of in silico primer testing when studying closely related species. Their results conflict with ours, possibly because we used a specific primer designed from partial matK sequences with the Primer3 (version 0.4.0) online program and then tested it in silico, as described in the primer design section. In addition, we used Triticum species collected from different countries, including Egypt, and the Egyptian accessions used in our work differ from those used in theirs. This may explain why our analysis could distinguish the Egyptian Triticum aestivum landraces and cultivars. Bafeel et al. (2011) found that the universal matK primer leads to an inconsistent success rate of matK as a barcode, so the universal primer needs further improvement.

Phylogenetic analysis is considered the most effective method for determining the suitability of a DNA region as a barcode, because a suitable region should produce species-specific clusters. Based on the partial chloroplast matK gene sequence and its translated amino acid sequence, our results documented the relationships among 20 Triticum accessions: diploid, 2n = 2x = 14 (Triticum monococcum L., AmAm; einkorn); tetraploid, 2n = 4x = 28 (Triticum turgidum subsp. dicoccoides (wild emmer), Triticum dicoccon subsp. dicoccon (emmer), and Triticum turgidum subsp. durum (durum or macaroni wheat), BBAuAu); and hexaploid, 2n = 6x = 42 (Triticum aestivum, BBAuAuDD; common wheat). Our phylogenetic tree, which discriminated all studied species, was consistent with Sourdille et al. (2001) and Feuillet et al. (2008), who reported that the genome of the allohexaploid species T. aestivum is composed of genomes A, B, and D (AABBDD; 2n = 42), derived from three different diploid species. T. turgidum subsp. durum, in contrast, is a tetraploid with the Au and B genomes (AABB; 2n = 28). The Au genome originated from T. urartu, while the B genome originated from Aegilops speltoides, commonly known as a wild or weedy goatgrass. The Am genome is derived from T. monococcum L. (einkorn), which includes both wild and cultivated forms and is also known as Triticum boeoticum Boiss. emend. Schiem. The D genome is derived from the wild or weedy grass Aegilops tauschii L.

Cytoplasmic studies indicated that the Sitopsis diploid species Ae. speltoides was the maternal donor in the original cross that gave rise to the tetraploid T. turgidum (Vedel et al. 1978). Further investigations by Bowman et al. (1983) and Dizkirici et al. (2016) also confirmed that Ae. speltoides was the maternal donor and the source of the B genome of T. turgidum and T. aestivum.

The current investigation suggests that matK gene sequence data are effective for resolving phylogenetic problems in Triticum species. The sequence variation, mean evolutionary rates and patterns, transition/transversion rate, and nucleotide diversity of the matK gene can also be used to interpret evolutionary relationships at the interspecific level within Triticum. Finally, the matK sequence can discriminate closely related Triticum species, so these sequences can be used as a DNA barcode for Triticum species.

Conclusion

The matK sequence plays an important role in discriminating closely related Triticum species, so these sequences can be used as a DNA barcode for tracing the evolutionary history of Triticum species. The hexaploid and tetraploid species are related to each other because they fall in the same group, while the diploid species fall in a separate group.