Molecular diversity analysis of genotypes from four Aegilops species based on retrotransposon–microsatellite amplified polymorphism (REMAP) markers

Genetic diversity analysis is an important tool in crop improvement. Species with high genetic diversity are a valuable source of variation for breeding programs. The aim of this study was to assess the genetic diversity of four species belonging to the genus Aegilops, which are often used to expand the genetic variability of wheat and triticale. Forty-five genotypes belonging to the genus Aegilops were investigated. Within- and among-species genetic diversity was calculated based on REMAP (retrotransposon–microsatellite amplified polymorphism) molecular markers. The results showed that REMAP markers are a powerful tool for genetic diversity analysis, producing a high number of polymorphic bands (96.09% of all bands were polymorphic). Among the tested species, Ae. crassa and Ae. vavilovii showed the highest genetic diversity and should be considered a valuable source of genetic variation.


Introduction
It is estimated that the world will need a billion metric tons of wheat (Triticum aestivum L.) in the year 2024, compared to its current production level of 777 million metric tons (International Grains Council 2019). This extra demand is expected to be met mainly through conventional breeding. However, there is growing concern among wheat breeders that the remaining variability in the bread wheat gene pool is insufficient to address current and future breeding objectives. It was therefore concluded that there is a need to broaden the genetic base of wheat (Hegde et al. 2002). The introduction of genes from Aegilops can contribute to obtaining favorable traits, including yield, quality, and resistance to biotic and abiotic stresses (Schneider et al. 2008). Aegilops species are also potential reservoirs of genes for resistance to cold (Monneveux et al. 2000), salinity (Colmer et al. 2006), as well as heat and drought (Molnár et al. 2004; Zaharieva et al. 2001). Moreover, most Aegilops species carry valuable resistance genes against pathogens (Schneider et al. 2008).
Genetic diversity analysis is an important tool for crop improvement programs. Accurate assessment of the levels of genetic diversity can be invaluable in crop breeding for identifying diverse parental plants to create segregating progenies with maximum genetic variability for further selection, as well as analyzing genetic variability in cultivars (Mohammadi and Prasanna 2003).
Retrotransposon-based marker methods such as inter-retrotransposon amplified polymorphism (IRAP) and retrotransposon-microsatellite amplified polymorphism (REMAP) are recent methods applied in genetic diversity analysis (Kalendar et al. 1999). REMAP is based on the amplification of DNA regions between retrotransposons and microsatellites using inter-simple sequence repeat (ISSR) primers together with retrotransposon primers, while IRAP is based on the amplification of sequences between two retrotransposons using two retrotransposon primers (Roy et al. 2015).
The aim of this study was to assess the genetic diversity of 4 species belonging to the genus Aegilops, which are often used to expand the genetic variability of wheat and triticale.

DNA isolation
For DNA isolation, fragments of young seedlings of the analyzed genotypes were collected and frozen in liquid nitrogen. Each sample consisted of plant material from 6 plants. DNA extraction was performed using the GeneMATRIX Plant & Fungi DNA purification kit (EURx, Gdańsk, Poland) according to the manufacturer's protocol. The quantity and quality of the isolated DNA were assessed using a NanoDrop spectrophotometer (Thermo Scientific, Madison, USA).

REMAP amplification
In the first step, 23 microsatellite-specific primers and 2 REMAP primers were tested. Of the tested primers, 1 REMAP and 8 microsatellite primers were selected (Table 2).
The amplification products were separated by electrophoresis in a 1.5% agarose gel containing 0.01% ethidium bromide in 1 × TBE buffer (89 mM Tris base, 89 mM boric acid, 2 mM EDTA, pH 8.0). The separation was carried out for 3 h at 110 V. The separated DNA fragments were visualized with a UV transilluminator and archived with the DigiGenius system (SynGene).

Data analysis
The bands were converted into a binary matrix in which "1" denoted the presence and "0" the absence of a band at a particular position. Only bright and reproducible products were scored. The level of polymorphism of each primer (polymorphic products/total products) and the relative frequency of polymorphic products (genotypes in which polymorphic products were present/total number of genotypes) (Belaj et al. 2001) were calculated. The resolving power of the primer was calculated using the formula Rp = Σ Ib, where Ib is band informativeness, calculated individually for each band scored for the primer as Ib = 1 − [2 × |0.5 − p|], where p is the proportion of genotypes in which the band is present (Prevost and Wilkinson 1999). Polymorphic information content (PIC) was calculated by applying the simplified formula (Anderson et al. 1993): PIC = 2fi(1 − fi), where fi is the frequency of the ith amplified band, expressed as a proportion.
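The marker statistics described above can be illustrated with a minimal Python sketch. It is not the authors' workflow: the 0/1 matrix is invented toy data, the function names are hypothetical, band informativeness uses the absolute-value form of Prevost and Wilkinson (1999), and averaging PIC over all bands of a primer is an assumption made here for illustration.

```python
# Toy binary matrix (hypothetical data): rows = genotypes, columns = bands.
toy_matrix = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
]

def band_frequencies(matrix):
    """Proportion of genotypes in which each band is present."""
    n_genotypes = len(matrix)
    n_bands = len(matrix[0])
    return [sum(row[j] for row in matrix) / n_genotypes for j in range(n_bands)]

def polymorphism_level(freqs):
    """Polymorphic bands / total bands; a band is polymorphic if 0 < p < 1."""
    return sum(1 for p in freqs if 0.0 < p < 1.0) / len(freqs)

def resolving_power(freqs):
    """Rp = sum of Ib over bands, with Ib = 1 - 2 * |0.5 - p|."""
    return sum(1.0 - 2.0 * abs(0.5 - p) for p in freqs)

def mean_pic(freqs):
    """Mean of PIC_i = 2 * f_i * (1 - f_i) over bands (averaging is an assumption)."""
    return sum(2.0 * f * (1.0 - f) for f in freqs) / len(freqs)

freqs = band_frequencies(toy_matrix)   # [1.0, 0.5, 0.25]
poly = polymorphism_level(freqs)       # 2 of 3 bands are polymorphic
rp = resolving_power(freqs)            # 0 + 1 + 0.5
pic = mean_pic(freqs)                  # (0 + 0.5 + 0.375) / 3
```

In this toy case, the monomorphic band contributes nothing to Rp or PIC, while the band present in exactly half of the genotypes reaches the maximum of both measures (Ib = 1, PIC = 0.5).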
The genetic diversity parameters, namely the percentage of polymorphic loci (P%), the number of observed (Na) and effective (Ne) alleles, Nei's genetic diversity (He), and the Shannon index (I), were calculated using the GenAlEx v.6.4 program (Peakall and Smouse 2012). To determine the distribution of genetic variation within and among populations, the analysis of molecular variance (AMOVA) was performed and the genetic distance of Nei and Li (1979) was estimated. Principal coordinate analysis (PCoA), using Euclidean distance, was performed in GenAlEx v.6.4 (Peakall and Smouse 2012) to represent the genetic distances between the populations. The genetic similarity matrix calculated from these data was used to construct a dendrogram for all genotypes based on the unweighted pair group method with arithmetic averaging (UPGMA) using PAST software (Hammer et al. 2001).
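As a rough illustration of how per-population parameters of this kind can be derived from a binary band matrix, the sketch below treats each band as a two-state locus and uses the band frequency p directly as the allele frequency. This haploid-style simplification is an assumption made here, not a description of the software's internal procedure, and the data and function names are hypothetical.

```python
import math

# Toy population (hypothetical data): rows = genotypes, columns = loci (bands).
toy_population = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]

def locus_diversity(p):
    """Return (Ne, He, I) for one two-state locus with band frequency p."""
    q = 1.0 - p
    homozygosity = p * p + q * q
    ne = 1.0 / homozygosity                  # effective number of alleles
    he = 1.0 - homozygosity                  # Nei's gene diversity, equal to 2pq
    shannon = -sum(x * math.log(x) for x in (p, q) if x > 0.0)
    return ne, he, shannon

def population_summary(matrix):
    """Percentage of polymorphic loci and mean Ne, He and I over all loci."""
    n_genotypes = len(matrix)
    n_loci = len(matrix[0])
    freqs = [sum(row[j] for row in matrix) / n_genotypes for j in range(n_loci)]
    stats = [locus_diversity(p) for p in freqs]
    return {
        "P%": 100.0 * sum(1 for p in freqs if 0.0 < p < 1.0) / n_loci,
        "Ne": sum(s[0] for s in stats) / n_loci,
        "He": sum(s[1] for s in stats) / n_loci,
        "I": sum(s[2] for s in stats) / n_loci,
    }

summary = population_summary(toy_population)
```

Monomorphic loci (band always present or always absent) contribute Ne = 1, He = 0 and I = 0, which is why populations with few polymorphic loci show low mean He and I.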

Results
Genetic diversity of 45 genotypes belonging to 4 species of the genus Aegilops was analyzed based on a set of 8 marker pairs, each constituting a combination of the REMAP primer and a microsatellite primer (Table 3).
Table 1 List of analyzed genotypes. *According to Kimber and Tsunewaki (1988). **According to information provided by the genebank. ***Country of origin may be assumed to be the country of the accession donor (accessions of Ae. vavilovii (Zhuk.) are not expected to originate from these regions).
AMOVA was calculated as a measure of genetic diversity. The results indicated that 66% of the total variance occurred among species, while 34% was attributed to variation within species. The genetic variation among the analyzed Aegilops species was significant (P < 0.001). Detailed genetic variation parameters are presented in Table 4. The mean percentage of polymorphic loci (%P) was 32.05 and ranged from 27.98 for Ae. juvenalis to 36.01 for Ae. crassa.
Ae. crassa showed the highest genetic diversity, with I = 0.157 and He = 0.101, while Ae. juvenalis showed the lowest, with I = 0.112 and He = 0.071. Ae. crassa also showed the highest number of different bands and of bands with a frequency ≥ 5%. The highest number of private bands was also observed for this species (Fig. 1).
To explore the relationships among the individuals, the Dice genetic similarity coefficient was calculated based on the binary data matrix. The similarity values for all analyzed genotypes ranged from 0.210 to 0.987, with a mean of 0.466. The similarity matrix was also calculated for each species separately. The highest similarity values were obtained for genotypes belonging to Ae. juvenalis; they ranged from 0.612 to 0.987, with a mean of 0.826. Similarity within Ae. neglecta ranged from 0.510 to 0.978, with a mean of 0.762. Genotypes belonging to Ae. crassa and Ae. vavilovii were the most diverse: the similarity coefficients ranged from 0.517 to 0.969 (mean 0.751) for Ae. crassa and from 0.625 to 0.964 (mean 0.748) for Ae. vavilovii.
The similarity matrices were used to construct a dendrogram using the UPGMA method. The forty-five genotypes clustered into 3 separate groups. Genotypes belonging to Ae. neglecta and Ae. juvenalis formed separate groups, while Ae. crassa and Ae. vavilovii co-clustered in one group. On the dendrogram, the Ae. crassa, Ae. vavilovii, and Ae. juvenalis genotype groups were located closer to each other, whereas the group of Ae. neglecta genotypes formed a separate branch (Fig. 2).
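The two steps used here, Dice similarity between binary band profiles followed by UPGMA clustering on the corresponding distances (1 − similarity), can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the PAST implementation: the four genotype profiles are invented, the function names are hypothetical, and ties between equally close clusters are broken by input order.

```python
from itertools import combinations

def dice(x, y):
    """Dice similarity: 2a / (n_x + n_y), where a = bands shared by both profiles."""
    shared = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    total = sum(x) + sum(y)
    return 2.0 * shared / total if total else 1.0

def upgma(names, profiles):
    """Return the merge history as a list of (cluster members, merge height)."""
    clusters = {i: (name,) for i, name in enumerate(names)}
    dist = {
        frozenset((i, j)): 1.0 - dice(profiles[i], profiles[j])
        for i, j in combinations(range(len(names)), 2)
    }
    merges = []
    next_id = len(names)
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        pair = min((frozenset(p) for p in combinations(clusters, 2)),
                   key=lambda p: dist[p])
        a, b = sorted(pair)
        size_a, size_b = len(clusters[a]), len(clusters[b])
        # Distance to the merged cluster is the size-weighted average (UPGMA).
        for c in clusters:
            if c not in (a, b):
                dist[frozenset((next_id, c))] = (
                    size_a * dist[frozenset((a, c))]
                    + size_b * dist[frozenset((b, c))]
                ) / (size_a + size_b)
        merges.append((clusters[a] + clusters[b], dist[pair]))
        clusters[next_id] = clusters[a] + clusters[b]
        del clusters[a], clusters[b]
        next_id += 1
    return merges

# Hypothetical band profiles for four genotypes.
names = ["g1", "g2", "g3", "g4"]
profiles = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
]
merges = upgma(names, profiles)
```

In this toy example, g1 joins g2 and g3 joins g4 at a Dice distance of 1/7 each, and the two pairs are then joined at the mean of the four between-pair distances.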
The relationships between the 45 analyzed genotypes were also determined using PCoA, which gave results comparable to those obtained using UPGMA clustering. Aegilops genotypes formed 3 distinct groups, corresponding to the UPGMA clusters (Fig. 3). The first three principal coordinates explained 67% of the total data variation.
Separate dendrograms were created for each species based on the obtained matrices. Genotypes belonging to Ae. neglecta formed 2 groups of 6 and 9 genotypes, respectively. The first group contained 4 genotypes from Western Europe and 2 genotypes from Greece and Bulgaria. In the second group, genotypes from the Balkan countries, Asia, and Turkey clustered together. The Ae1440 genotype from Greece was located on the periphery of the dendrogram and differed the most from the other genotypes belonging to Ae. neglecta.
Genotypes belonging to Ae. crassa formed 3 groups of genotypes with a similar origin. Genotypes from Afghanistan and Uzbekistan grouped together, as did genotypes from Jordan and Turkey. Genotypes from Tajikistan and Turkmenistan clustered in the third group.
On the dendrogram constructed for Ae. vavilovii genotypes, accessions from Uzbekistan and Turkmenistan formed a close group, while the genotype from Morocco was the most distinct and was located on the periphery of the dendrogram.
Ae. juvenalis genotypes formed two main cluster groups on the dendrogram. In group I, there were 3 subgroups with 2, 5, and 3 genotypes, respectively. Group II contained 3 genotypes. The PI574463 genotype was located on the periphery of the dendrogram.

Discussion
Molecular analysis using REMAP and ISSR markers is an efficient and inexpensive method to evaluate genetic diversity that has been used repeatedly to determine genetic relationships in many plants (Mahjoob et al. 2016; Ghufaili and Al-Tamimi 2017; Cheraghi et al. 2018; Vuorinen et al. 2018). REMAP markers generate a wide range of band patterns. They enable identification of polymorphisms between species as well as within a single species or cultivar (Kalendar et al. 1999; Branco et al. 2007; Mahjoob et al. 2016; Cheraghi et al. 2018).
In our study, REMAP markers were used to evaluate genetic diversity within and between 4 species of the genus Aegilops used in crossbreeding to introduce genetic diversity into triticale. Among the obtained bands, 96.09% were polymorphic, which confirmed that these markers are an effective tool for the analysis of genetic diversity of species from the genus Aegilops. Paczos-Grzęda and Bednarek (2014) used REMAP markers to analyze polymorphism of hexaploid species of the genus Avena and obtained 73.5% polymorphic products. Analogous studies on common wheat were conducted by Holasou et al. (2019), who obtained a percentage of polymorphic loci (PPL) of 86.4%. REMAP markers were also used to analyze genetic similarity in triticale, where the level of polymorphic bands was 88.46% (Kalinka and Achrem 2018).
PIC (polymorphism information content) and Rp (resolving power of the primer) are the measures of marker usefulness for similarity analyses (Powell et al. 1996). PIC is the probability of a primer detecting polymorphism between individuals and depends on the number of detectable alleles and distribution of their frequency. Rp specifies the discriminatory potential of primers and estimates the ability of the technique to produce optimally informative bands (Yousefi et al. 2015). In the current study, both PIC and Rp were calculated for the applied marker combinations. The obtained values (PIC = 0.25 and Rp = 28.13) confirmed the usefulness of the combination of REMAP and ISSR primers for the analysis of genetic similarity of genotypes belonging to the genus Aegilops. Taheri et al. (2018) analyzed genotypes of the genus Triticum and obtained a PIC of 0.40 using REMAP markers.
Genotypes belonging to different species of the genus Aegilops were analyzed in order to assess the level of genetic diversity. Pour-Aboughadareh et al. (2018) analyzed 8 Aegilops species and 4 Triticum species using SCoT (start codon targeted) markers. The authors showed that SCoT markers, like REMAP markers, were a good tool for analyzing the genetic diversity of species within the genus Aegilops. They obtained 98.3% polymorphic products. In the present study, the level of polymorphism was 96.09%. Thomas and Bebeli (2010) also observed a high level of polymorphism in Aegilops genotypes using the RAPD and ISSR methods. Abbasov et al. (2019) used SSR markers to evaluate the genetic diversity of Aegilops genotypes from Azerbaijan and Georgia, but the number of obtained amplicons was much lower than for REMAP or SCoT markers. The results obtained in this study showed that the most diverse species were Ae. crassa and Ae. vavilovii. Genotypes belonging to these species exhibited the highest values of genetic variation parameters and the lowest values of similarity coefficients. Pour-Aboughadareh et al. (2018) demonstrated that Ae. cylindrica and Ae. umbellulata were characterized by the highest levels of genetic diversity among the analyzed Aegilops species, while Ae. crassa had a lower level of diversity. Their research also showed a lower level of Ae. crassa differentiation compared to Ae. neglecta. In the present study, Ae. crassa was characterized by the highest level of diversity, whereas Ae. neglecta had a medium level of similarity. Congruent conclusions can be found in the studies of Abbasov et al. (2019) and Pour-Aboughadareh et al. (2018). The authors of the latter study argued that such discrepancies between the results of different works could be caused by differences in the sample size or geographical origin of the analyzed accessions.
Clustering based on genetic distance was consistent with taxonomy. UPGMA and PCoA analyses based on REMAP markers made it possible to discriminate the Aegilops species at the cluster level. No strict grouping by country within the species was observed, but we recorded a certain tendency for accessions from the same geographical regions to cluster together. A similar trend was observed for genotypes from Azerbaijan and Georgia (Abbasov et al. 2019). In contrast, Pour-Aboughadareh et al. (2018) showed that genetic diversity did not correspond to geographical distribution. They observed that all species clustered based on their genomic structure and that these clusters were approximately consistent with their taxonomic classification. These differences may also result from the different numbers and origins of the samples, as well as from the number of species analyzed.
In summary, we used molecular markers to investigate the variability within and between 4 hexaploid Aegilops species, which are often utilized in wheat breeding programs as a source of genetic diversity. Our research demonstrated that Ae. crassa and Ae. vavilovii were the most valuable species, which showed the highest values of genetic diversity parameters and the lowest genetic similarity indexes.