
12.1 Introduction

Bread wheat is one of the most important staple crops and provides over one-fifth of the calories consumed by the world’s population (FAOSTAT 2020). Global wheat production needs to increase in light of the growing human population and changing climatic conditions (Hickey et al. 2019; Ray et al. 2012, 2013; Tilman et al. 2011). To cope with the numerous challenges that wheat faces, such as heat, drought, and diseases, it is important to find useful sources of genes and alleles for its improvement and, at the same time, to develop approaches for the efficient transfer of this genetic variability to cultivated wheat. Efforts have already been made in this direction, the most successful of them after the reference genome assembly of T. aestivum cv. CHINESE SPRING became available in 2018 (Appels et al. 2018). Since then, further resources have been added to speed up breeding activities and the development of markers for important traits. For instance, in 2019 and 2020, 1K exome capture data and a wheat pan-genome resource containing assemblies of 10+ wheat genomes, including elite cultivars from across the globe, were generated (He et al. 2019; Walkowiak et al. 2020). Although a high-quality reference assembly is available for CHINESE SPRING, it does not capture the complete variation within the species that can be exploited for variety development. Therefore, the above genomic resources, including the pan-genome and exome capture data, have proven to be highly useful. These resources have also been exploited for the identification of useful wild introgressions in wheat, followed by marker development for biotic and abiotic stress tolerance traits. The current pan-genome resource consists of ten hexaploid wheat genomes with pseudomolecule-level assemblies and five with scaffold-level assemblies.

One of the major objectives of any breeding program has been to develop wheat varieties that are resilient to adverse environmental conditions as well as biotic stresses. Significant progress has been made in the genetic improvement of wheat, mainly after the green revolution, using either conventional breeding or molecular breeding through marker-assisted selection. The introduction of dwarfing genes during the green revolution transformed wheat variety development and led to a dramatic increase in wheat yield across the globe (Ali et al. 1973; Hedden 2003; Pingali 2012). Similarly, important genetic markers have been identified for the QTL/genes providing resistance against different biotic and abiotic stresses (Saini et al. 2022; Singh et al. 2021). This has certainly improved the breeding populations of wheat; however, it has also narrowed the genetic base, resulting in reduced variability within the species. This makes it necessary to explore the wild and related species of wheat, which are an important reservoir of useful genetic diversity as well as of genes for biotic and abiotic stress tolerance.

Based on the evolutionary distance between the species and the success rate of interspecies hybridization, Harlan and de Wet (1971) introduced the idea of wheat gene pools comprising the primary, secondary, and tertiary gene pools (Fig. 12.1) (Jiang et al. 1993; Mujeeb-Kazi et al. 2013). While the genomes of the primary and secondary gene pools share some homology with the wheat genome, the species in the tertiary gene pool do not share any homology with the wheat genome, and therefore their chromosomes cannot recombine with those of wheat through homologous recombination. It is also more difficult to cross the species of the secondary and tertiary gene pools with hexaploid wheat than the species of the primary gene pool (Mujeeb-Kazi et al. 2013).

Fig. 12.1 Overview of bread wheat’s gene pools with examples in each category

The species in the primary gene pool include modern wheat cultivars and other T. aestivum landraces, Triticum spelta (AABBDD), tetraploid durum wheat T. turgidum (AABB), diploid wheat species T. urartu (AA), and Aegilops tauschii (DD). Examples of species in the secondary gene pool are the tetraploid species T. timopheevii (AAGG) and the diploid species T. monococcum (AmAm) and Ae. speltoides (SS). Species in the tertiary gene pool include cultivated species such as rye (RR) and barley (HH) as well as wild relatives of wheat. Importantly, wild relatives of wheat contain a treasure trove of variability that can overcome the genetic bottlenecks found in bread wheat (Tiwari et al. 2015). Examples of these are wild grasses such as diploid Thinopyrum elongatum (EE), tetraploid Ae. geniculata (UUMM), and octoploid Leymus arenarius (XXXXNNNN) (Pour-Aboughadareh et al. 2021; Anamthawat-Jónsson 2001). Due to the absence of pairing at meiosis between the tertiary pool chromosomes and those of wheat, techniques such as radiation-induced chromosomal breaks or gene editing must be used to create introgression lines (Benlioğlu and Adak 2019; Jiang et al. 1993; Mujeeb-Kazi et al. 2013).

As mentioned above, the availability of genomic resources in hexaploid bread wheat has driven the development of useful markers, leading to stress-resilient wheat cultivars. However, given the complexity of the wheat genome, owing to its large size and polyploid nature, it became necessary to develop genomic resources for the above wild relatives of wheat. Considerable progress has already been made in this direction. For example, the diploid relatives Ae. longissima, Ae. speltoides, and Ae. sharonensis, as well as several accessions of Ae. tauschii, all have recently released reference quality assemblies available for BLAST and genome browsing (Avni et al. 2022; Gaurav et al. 2022; Zhou et al. 2021). Further, the wild tetraploid T. turgidum ssp. dicoccoides cv. “ZAVITAN” has also recently had a high-quality assembly released, with optical maps used for more accurate scaffolding.

The present chapter is mainly focused on providing an overview of the available reference assemblies and genomic resources in wheat’s wild relatives, which have been explored to identify useful introgressions in wheat. Some examples include (i) Fhb7 (from T. elongatum), providing resistance against Fusarium head blight in wheat (Guo et al. 2015); (ii) the well-known 1BL/1RS translocation from rye, which carries useful genes for improved grain yield and biomass, especially under abiotic stress (Lukaszewski 1993); and (iii) Lr57 and Yr40 from Ae. geniculata, providing resistance against rust diseases (Kuraparthy et al. 2007a, b). Recent developments in next generation sequencing technologies have led to low-cost sequencing approaches such as skim sequencing, which provides a useful resource for the identification of alien introgressions even at a coverage of less than 0.1x (Adhikari et al. 2022b). A comparative overview of syntenic relationships between wheat and wild relatives is also discussed. Overall, the present chapter will serve as a useful resource for students and researchers working in alien wheat genomics and exploring useful alien introgressions for the development of wheat cultivars.

12.2 State of Reference Assemblies in Wheat and Its Wild Relatives

Wild and related species of wheat are a reservoir of important genes for tolerance to different abiotic and biotic stresses. Therefore, genomic resources for these wild relatives will be an asset for the identification of genes/QTLs and their linked markers, which may help simplify wheat genomics and lead to the development of elite wheat cultivars, a task that is otherwise difficult due to the large and complex wheat genome. Reference genome assemblies are now available for some of the important wild species belonging to all three wheat gene pools. Reference assemblies for the important wheat relatives are described briefly below.

12.2.1 Primary Gene Pool Reference Genomes

The first draft of the reference genome of bread wheat became public in 2014, utilizing survey sequencing of individual chromosomes. Though this is considered a significant breakthrough in the world of wheat genomics, this initial draft sequence only accounted for ~61% of the entire wheat genome (Lukaszewski et al. 2014). Four years later, with the use of additional genetic data, including radiation hybrids, and additional sequence data, and with the advancement of next generation sequencing (NGS) technologies, the fully annotated CHINESE SPRING reference genome was released with pseudomolecule assemblies for all 21 chromosomes (Appels et al. 2018). This reference genome has been continuously updated with the use of new technologies, with the intent of both more accurate contig establishment and scaffolding and the annotation of genes not initially reported in v.1.0 (Alonge et al. 2020; Zhu et al. 2021). Extensive comparative data shows that CHINESE SPRING is a genetic outlier when compared to domesticated species of Triticum sp. (Walkowiak et al. 2020).

The development of the pan-genome of wheat has allowed for more precise research and insight into the primary gene pool of wheat, including T. spelta. As of December 2022, 13 cultivars of wheat and one cultivar of T. spelta are available for BLAST as well as genome browsing. Interestingly, with the information gained from the 10+ genome project, alien introgressions could be traced using reads derived from T. timopheevii and T. ponticum (JJJJJJsJsJsJsJs) in T. aestivum cv. LANCER, and from Ae. ventricosa (NvNvDvDv) in T. aestivum cv. JAGGER, in order to obtain more exact coordinates of these loci.

Tetraploid species of both cultivated (T. durum) and wild emmer (T. dicoccoides) wheat are also a part of the primary gene pool, because homologous recombination can occur within the shared sub-genomes (A and B). Of the wheat grown for human consumption, only 5% is durum, whereas 95% is hexaploid. This may be attributed to the genome plasticity of hexaploid wheat, which allowed for a broader potential for adaptation compared to tetraploid wheat (Mastrangelo and Cattivelli 2021). Also, compared to hexaploid wheat, the elite gene pool of durum wheat has little genetic diversity, and most elite durum wheat cultivars are moderately to highly susceptible to disease, which is a challenge for resistance breeding (Clarke et al. 2010; Miedaner and Longin 2014). This is also not surprising given the widely known fact that hexaploid bread wheat evolved from an inter-specific hybridization between T. dicoccoides and the diploid species Ae. tauschii (Dvorak et al. 2012; Lukaszewski et al. 2014; Mcfadden and Sears 1946). However, published reports show that wild emmer introgressions were responsible for significant gains in genetic diversity among hexaploid lines, as shown recently by the 1000 Wheat Exome Project (He et al. 2019). Similarly, a large share of the phenotypic variance for several important traits, including harvest weight, drought response, and plant height, is attributed to these wild emmer introgressions (Nigro et al. 2022; Zhu et al. 2019).

Given the importance of wild emmer introgressions in hexaploid bread wheat, improved reference genomes of both wild emmer and cultivated durum wheat were published in 2019. The improved reference genome of wild emmer wheat (WEW) cv. ZAVITAN utilized optical maps as well as advancements in alignment technologies to increase the effective size of the reference genome by ~67 Mb and to add over 2,000 high confidence genes. Additionally, between WEW_v1.0 and WEW_v2.0, gaps of unknown size dropped from 2,767 to only 471 (Avni et al. 2017; Zhu et al. 2019). Later in 2019, a high-quality reference genome of T. durum cv. SVEVO was published, and by utilizing the WEW data, it was shown that short-term evolutionary changes had little effect on synteny between WEW and durum. There were, however, lower copy numbers of important gene families such as NLRs in SVEVO in comparison with ZAVITAN, which implies a reduction of canonical R-genes (Maccaferri et al. 2019).

Diploid progenitor species of the bread wheat A (T. urartu) and D (Ae. tauschii) genomes, as well as the close B genome relative Ae. speltoides (SS), all serve as less complex systems for genomics research than hexaploid bread wheat (Kerby and Kuspira 1987). Therefore, in recent years, reference genomes for all three wheat genome donors (A genome: T. urartu; B genome: Ae. speltoides; D genome: Ae. tauschii) have been produced in order to help with wheat improvement. While the donors of the A and D genomes are included in the primary gene pool, the donor of the B genome is included in the secondary gene pool. Therefore, the reference assemblies for the donors of the A and D genomes are discussed in more detail below, and the reference assemblies for the B genome donor (Ae. speltoides) are discussed under a separate sub-heading in the next section on the secondary gene pool.

12.2.2 A Genome

The T. urartu reference genome was first published in 2018 (Ling et al. 2018), four months before the release of the CHINESE SPRING v.1.0 reference genome. In their analysis, done using the 2014 draft wheat genome v. 0.4, strong structural variations were observed between the T. urartu A genome and the bread wheat A genome, suggesting evolutionary rearrangements. Within the diverse population of T. urartu accessions used for this study, and using the reference genome, three distinct groups were identified in the Fertile Crescent. These diverse accessions were screened for powdery mildew (PM) resistance and, excitingly, after inoculation with PM, one group (group 2) showed significant resistance against the pathogen. Further analysis using the SNP data revealed a single putative candidate gene involved in providing resistance against powdery mildew. This resistance was perhaps due to natural selection for powdery mildew resistance as well as adaptation to growth at high altitudes.

12.2.3 D Genome

The D genome progenitor Ae. tauschii is a well of genetic variability for wheat, given the low level of variation seen within the D genome of wheat (Dubcovsky and Dvorak 2007; Voss-Fels et al. 2015). This lack of variation is partially due to the small proportion of diversity that was captured during polyploidization, when hybridization occurred between ancient, domesticated T. turgidum (AABB) and a small population of Ae. tauschii near the Caspian Sea (Dubcovsky and Dvorak 2007; Gaurav et al. 2022; Luo et al. 2017; Voss-Fels et al. 2015). However, because synthetic wheat can be developed by hybridizing tetraploid species with Ae. tauschii, diversity in the D genome can be integrated into the breeding germplasm (Li et al. 2018). The first Ae. tauschii reference genome was released in 2017 for accession AL8/78; the current version (Aet v.5.0) has been improved using optical maps as well as PacBio long-read sequencing (Luo et al. 2017; Wang et al. 2021).

Since the initial release, several strides have been made in Ae. tauschii genomics. For instance, Zhou et al. (2021) developed reference quality genomes of four additional accessions representing four sub-lineages of Ae. tauschii with the intent to trace wild introgressions better in the germplasm. In the same year, the Open Wild Wheat Consortium (OWWC) generated whole genome sequencing (WGS) data for 242 non-redundant accessions of Ae. tauschii, to probe the evolution of bread wheat, determine the variation within the population, and perform genome-wide association studies (GWAS) for important traits using the AL8/78 reference genome (Gaurav et al. 2022). This study was able to show the two major lineages that make up the D genome in wheat, and using the wheat pan-genome, show the physical regions that come from these lineages. Additionally, a third lineage not associated with the evolution of bread wheat was also characterized.

Further, using k-mer-based GWAS, candidate genes for flowering time, stem rust (Sr) resistance, trichome number, spikelet number, PM resistance, and wheat curl mite resistance were also reported. Efforts are currently underway by the OWWC to develop a pan-genome resource for Ae. tauschii which will provide further information pertaining to the diversity prevailing in the genome sequences of diverse Ae. tauschii accessions (openwildwheat.org).
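
To make the idea of k-mer-based association concrete, the following minimal Python sketch records which k-mers are present in each accession and then compares the mean trait value of accessions carrying a k-mer with that of accessions lacking it. It is an illustration only and not the pipeline of Gaurav et al. (2022): the function names, the toy sequences, and the simple difference-in-means score are assumptions made here, and real analyses work on canonical k-mers from WGS reads and fit statistical models that account for population structure.

```python
from collections import defaultdict
from typing import Dict, List, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of k-mers in a sequence, skipping any that contain N."""
    seq = seq.upper()
    return {
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if "N" not in seq[i:i + k]
    }


def kmer_effects(
    reads_by_accession: Dict[str, List[str]],
    phenotype: Dict[str, float],
    k: int,
) -> Dict[str, float]:
    """Score each k-mer by mean(trait | present) - mean(trait | absent).

    Hypothetical helper for illustration; no correction for population
    structure and no reverse-complement (canonical k-mer) handling.
    """
    # Map each k-mer to the set of accessions in which it was observed.
    presence: Dict[str, Set[str]] = defaultdict(set)
    for accession, reads in reads_by_accession.items():
        for read in reads:
            for km in kmers(read, k):
                presence[km].add(accession)

    all_accessions = set(reads_by_accession)
    effects: Dict[str, float] = {}
    for km, carriers in presence.items():
        non_carriers = all_accessions - carriers
        if not non_carriers:
            continue  # present in every accession, so uninformative
        mean_with = sum(phenotype[a] for a in carriers) / len(carriers)
        mean_without = sum(phenotype[a] for a in non_carriers) / len(non_carriers)
        effects[km] = mean_with - mean_without
    return effects


# Toy data: two "resistant" accessions share a variant absent from the others.
toy_reads = {
    "acc1": ["ACGTAC"],
    "acc2": ["ACGTAC"],
    "acc3": ["ACGAAC"],
    "acc4": ["ACGAAC"],
}
toy_trait = {"acc1": 1.0, "acc2": 1.0, "acc3": 0.0, "acc4": 0.0}
scores = kmer_effects(toy_reads, toy_trait, k=4)
```

In this toy example, k-mers spanning the variant carried by the two resistant accessions receive a positive score and those spanning the alternative allele receive a negative one. Working directly on k-mers avoids dependence on a single reference genome, which is one reason such approaches are attractive for diverse wild accessions.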

12.2.4 Secondary Gene Pool Reference Genomes

In comparison with the primary gene pool of wheat, genomic resources for members of the secondary gene pool are limited. Therefore, efforts are being made in this direction. For instance, (i) reference assemblies for wild and cultivated T. monococcum accessions are available for public use (Ahmed et al. 2023).

Diploid wheat T. monococcum, which is a close relative of T. urartu (the A genome donor), is the only species with both domesticated (T. monococcum ssp. monococcum) and wild (T. monococcum ssp. aegilopoides) accessions. Therefore, reference assemblies for this species should help in simplifying wheat genomics and may be an improvement over the reference assembly available for T. urartu. (ii) Transcriptome data for T. monococcum is also available from an earlier study (Fox et al. 2014). (iii) A core set of wild as well as domesticated einkorn was also recently characterized by Adhikari et al. (2022a). Using GBS data, 145 domesticated einkorn accessions and 584 wild einkorn accessions were divided into the groups α, ß, γ, and monococcum. A set of T. urartu accessions was also a part of this study and, as expected, these clustered together, distant from the T. monococcum accessions.

Compared with the A and D genomes, the B genome of wheat has been difficult to study in a diploid species due to the proposed extinction of the direct progenitor (Riley et al. 1958; Sarkar and Stebbins 1956). Researchers, however, have worked around this issue by studying species in the Sitopsis section of Aegilops (S*S*), which are closely related to the B genome (Kerby and Kuspira 1987). In the last decade, reference quality genomes for five Sitopsis species were released, providing resources not only for the elucidation of the B genome of wheat, but also for research on the D genome of wheat (Li et al. 2022; Sandve et al. 2015; Yamane and Kawahara 2005; Yu et al. 2022). Recently, Avni et al. (2022) also communicated the release of three reference quality genomes in the same section (Sitopsis), which included two new assemblies, for Ae. longissima (SlSl) and Ae. speltoides, and one assembly for Ae. sharonensis, which was in fact first communicated by Yu et al. (2022).

Alignments of the above assemblies with the different sub-genomes of wheat revealed a strong linear alignment of Ae. sharonensis and Ae. longissima with the D genome of bread wheat, and of Ae. speltoides with the B genome, as expected from their close relationship with the respective sub-genomes (Fig. 12.2). This was further supported by the clustering of high confidence gene annotations of Ae. sharonensis and Ae. longissima with bread wheat’s D genome as well as Ae. tauschii, and of those of Ae. speltoides with WEW, durum wheat, and bread wheat’s B genome.

Fig. 12.2 Synteny between diploid wheat chromosomes 1A, 1S, 1D and hexaploid bread wheat’s genome

In March 2022, reference assemblies of two additional “S” genomes (Ae. bicornis (SbSb) and Ae. searsii (SsSs)) were communicated, finally completing the Sitopsis section of the Triticeae. Both of the above S genome assemblies also clustered with the D genome and the D genome progenitors of wheat in comparative alignments, showing their closer association within the ancestry of wheat’s evolution. Interestingly, with this complete information, the divergence of the D-related Sitopsis clade from the D progenitors was estimated to have happened around 5.23 Mya, whereas the divergence of Ae. speltoides and the B genomes of both durum and bread wheat happened more recently, at 4.44 Mya. With these available genomes, more precise genomic research can now be performed in the B genome of wheat, along with deeper study of the evolution of the D genome and its progenitors.

12.2.5 Tertiary Gene Pool Reference Genomes

The tertiary gene pool of wheat is underrepresented in terms of resource availability and research, due to the difficulty in defining these species as well as limited genomic information (Qi et al. 2007; Schneider et al. 2008; Tiwari et al. 2015). Discussions in the literature have considered the Sitopsis section species, other than Ae. speltoides, as members of the tertiary gene pool, but given the recent advancements in their genomic resources, it is more fitting to place them in the secondary gene pool. Although some species, such as T. elongatum (EE), have had assemblies and annotations completed for the purposes of gene cloning, no reference genomes for wild grasses in the tertiary gene pool are currently available (Wang et al. 2020).

Two cultivated species belonging to the tertiary gene pool, on the other hand, barley (Hordeum vulgare; 2n = 2x = 14; HH) and rye (Secale cereale; 2n = 2x = 14; RR), have had reference genomes published in the last ten years. Barley was the first species in the Triticeae tribe to have a reference genome (Melonek and Small 2022; Mochida and Shinozaki 2013; Purugganan and Jackson 2021). Originally sequenced and annotated in 2012, one of the biggest achievements of this assembly was overcoming the size and complexity of cereal genomes, which arise from their highly repetitive elements (Mayer et al. 2012). Since the release, updates have been made to properly order the chromosomes and create a better physical map, as well as to reduce the unanchored sequences from ~250 to 83 Mb (Beier et al. 2017; Mascher et al. 2017; Monat et al. 2019). In a similar fashion to the achievements in wheat, a pan-genome project was also developed in barley, which included the sequencing and assembly of 19 additional barley lines, including two highly transformable lines (GOLDEN PROMISE and IGRI) as well as a wild barley genotype (Jayakodi et al. 2020). This resource was, and is, an important milestone in the advancement of cereal crop genomics due to its early elucidation. Rye is an important member of the tertiary gene pool as a contributor of high tolerance to both biotic and abiotic stresses. Additionally, rye has been an important player in wheat breeding due to the importance of the 1BL/1RS and 1AL/1RS translocations, which confer resistance to multiple diseases (Zeller and Sears 1973; Jung and Seo 2014). Moreover, synthetic hybrids of rye and wheat, named Triticale, have gained popularity due to their nutritional value as forage (Zhu 2018).

To better understand the underlying genetics behind the important aspects of rye, two reference quality genomes of rye were released simultaneously in 2021. In the article by Rabanus-Wallace et al. (2021), a chromosome scale assembly was developed in the background of cv. LO7, showing a genomic makeup similar to that of other members of the Triticeae and strong collinearity with the barley genome. Using this assembly, the researchers were able to determine a translocated region conferring frost tolerance in a 5A/5RL translocation line, first identified using chromosome labeling and confirmed using read depth analysis on bread wheat’s 5A chromosome and rye’s 5R chromosome. In another article, by Li et al. (2021), a reference assembly was created for an additional genotype of rye, cv. WEINING, which provided further support for the strong collinearity between the tertiary gene pool genomes. In their study, utilizing 2,517 single-copy orthologous genes, Li et al. (2021) developed a phylogenetic tree depicting 12 grasses and their evolutionary divergence. Although it is not necessarily new information, with the rye genome sequenced, the authors were able to compare rye with the other 11 sequenced genomes to deduce that rye diverged from wheat ~5 Mya, after the divergence of barley and wheat, giving further evidence of rye’s closer relationship with bread wheat and its progenitors. For a summary of the state of reference genomes in Triticeae from the past five years, see Fig. 12.3.

Fig. 12.3 Timeline of reference genomes in Triticeae from 2017 to 2022

12.3 Alien Introgressions and Comparative Genomics

As described above, wild wheat relatives play an important role in the production of high performing wheat cultivars. Modern breeding techniques have reduced the genetic diversity in the breeding germplasm to select for higher yield (Keilwagen et al. 2022; Sansaloni et al. 2020; Schneider et al. 2008). Utilizing DNA segments from wild relatives that have been integrated into bread wheat’s genome is a method to overcome this reduction in genetic diversity (Fig. 12.4); however, methods for detecting these introgressions are a must to properly trace these segments in breeding programs (Hao et al. 2020; Molnár-Láng et al. 2015). In this section, we will describe the methods, both old and new, that researchers utilize to detect and trace these introgressions, describe the important genes that come from these introgressions, as well as show the usefulness of modern technologies for comparative genomic analysis.

Fig. 12.4 Methods for developing introgression lines from wild relatives coming from (a) the primary gene pool and (b) the secondary and tertiary gene pools

12.3.1 Methods for Detecting Alien Introgressions

The methods for detecting alien introgressions can be broadly classified into cytological/cytogenetic methods, PCR-based markers, and recent next generation sequencing (NGS)-based methods, including skim sequencing.

12.3.2 Cytological Methods

Cytological methods for detecting chromosomal morphological differences have been used for almost 100 years (Gill and Friebe 1996). A popular method for observing different sizes and compositions of chromosomes is centromeric heterochromatin staining, or C-banding, which allows for visualization of chromosomes and/or karyotypes of different species on a conserved scale (Endo and Gill 1996; Gill et al. 1991). This method was used for detecting rye/wheat hybrid pairing as far back as 1977, as well as for determining T. timopheevii introgressions in T. timopheevii × T. aestivum hybrids (Badaeva et al. 1991; Dhaliwal et al. 1977). The C-banding method used alongside genomic in situ hybridization (GISH) also allowed for the detection of introgressions from Ae. umbellulata (UU), Ae. speltoides, Ae. comosa (MM), Ae. longissima, and T. timopheevii, as well as several others, as far back as the early 1990s (Friebe et al. 1996).

More recently, regions of Leymus racemosus DNA containing the important Fusarium head blight (FHB) resistance gene Fhb3 introgressed into bread wheat were traced using GISH and C-banding (Qi et al. 2008). Another still popular method of visualizing introgressions in wheat is fluorescence in situ hybridization (FISH), which utilizes fluorescently labeled DNA probes to detect important regions of chromosomes, such as introgressions (Campos-Galindo 2020; Jiang and Gill 2006). This method is still much in use today to provide further evidence of translocations in wheat, including the previously mentioned frost tolerance associated region of rye introgressed into a wheat background (Rabanus-Wallace et al. 2021). It has also been used to dissect introgressions coming from T. elongatum, Ae. columnaris (UcUcXcXc), Ae. caudata (CC), and T. timopheevii, as well as many more not noted here (Badaeva et al. 2017; Devi et al. 2019; Grewal et al. 2020; Guo et al. 2022). Another use of this method was described in 2021, where FISH and GISH were utilized to visualize the recombination patterns of susceptible vs resistant genotypes of Ae. geniculata (UgUgMgMg) introgression lines in F3 families (Steadham et al. 2021).

12.3.3 PCR-Based Markers

Another method to detect alien introgressions is to use PCR-based markers that are polymorphic between bread wheat and the wild species. The use of PCR-based markers for identifying alien introgressions in bread wheat dates back to the early 1990s, when Rogowsky et al. (1993) designed PCR and RFLP markers to detect the famous 1AS.1RL, 1BS.1RL, and 1DS.1RL rye introgressions in a wheat background. Since then, PCR-based markers have been continuously used for identifying introgressions. More recently, Li et al. (2019) designed markers to detect Thinopyrum intermedium ssp. trichophorum (JJJsJsStSt) introgressions in wheat that provide significant stripe rust resistance. Illustrating the value of combining old and new technologies, these researchers utilized GISH, FISH, and C-banding to validate the effectiveness of the PCR markers, which can now be utilized in marker-assisted selection (MAS) to incorporate these genes into the breeding germplasm. Further, polymorphic SSR markers were also developed recently to detect introgressions from the synthetic amphidiploid species T. kiharae (AtAtGGDD), which holds a reservoir of genes that have the potential to improve resistance to many diseases as well as to increase the quality of flour production (Orlovskaya et al. 2020).

12.3.4 NGS Technology

With the advent of cost-effective NGS methods, researchers now have the ability to obtain sequence data from the transcriptome, the exome, and the whole genome. This data can be generated from any species of interest, including the wild relatives of wheat. Examples of this have been mentioned in Sect. 12.2 of this chapter in regard to whole genome assembly; however, data for wild relatives is constantly being generated for the purposes of gene mapping and cloning, as well as for deeper characterization of wild relatives. One such example comes from Tiwari et al. (2015), where the 5Mg chromosome of Ae. geniculata was sorted, sequenced, and assembled to gain insight into this important species. This information helped with the fine mapping of Lr57 and Yr40 in translocation wheat lines (Steadham et al. 2021).

In the past 5 years, NGS data has been utilized to detect introgressions in Triticeae species without the additional step of SNP calling, which can create artifacts and requires more computational resources (Li and Wren 2014). Genotyping by sequencing (GBS) provides short-read, low-coverage genomic data, usually for the purpose of creating VCF files in order to genotype a population with relatively low computational and storage requirements (Perea et al. 2016). Such data has now been shown to be able to discern introgressions in both wheat and barley. Keilwagen et al. (2019) were able to detect putative introgressions from wild relatives in wheat, including the 1BL/1RS translocation. Interestingly, in the panel of 209 elite European winter wheat varieties for which GBS data was generated, many of the regions where introgressions were detected overlapped with important genes used in breeding programs, such as Yr17 from Ae. ventricosa (NvNvDvDv) and Lr19 from T. ponticum, as well as genes not yet known to be from wild relative introgressions, such as Glu-D1 and Ppo-D1. With the decrease in the cost of WGS data generation, one group set out to see the benefit of using resequencing data from multiple wild relatives to detect introgressions, utilizing the 10+ genomes described above. Keilwagen et al. (2022) used wild relative WGS data, both from public repositories and generated in their own experiments, to determine the regions of wild introgressions in 10 genotypes from the 10+ wheat genome project. In doing so, 9 introgressions coming from the wild relatives Ae. ventricosa, Ae. markgrafii (CC), Ae. speltoides, T. timopheevii, Ae. umbellulata, Ae. uniaristata (NN), and T. ponticum were found to be present on chromosomes 2A, 2B, 2D, 3D, and 4A. The researchers determined that the introgressions found on 2AS (from either Ae. ventricosa or Ae. markgrafii), 2B (from T. timopheevii), and 2DL (from Ae. markgrafii or Ae. umbellulata) contained genes that shared >90% amino acid similarity with leaf rust and stripe rust resistance genes. Fascinatingly, when the two introgressions present in all 10 genotypes, on 2A and 4AL and coming from Ae. speltoides, were checked in relatives of bread wheat, they were found in T. urartu, T. boeoticum, and T. monococcum, but not in T. dicoccoides or T. spelta. The studies also determined that these introgressions could be detected using only 1% of the total data.

To further save on computational cost, researchers have shown that skim sequencing of genomes at a coverage as low as 0.025x can be used to detect introgressions (Adhikari et al. 2022a, b). These authors used this method to identify barley introgressions on chromosomes 7A, 7B, and 7D in a population of 384 wheat–barley introgression lines. Additionally, they screened T. intermedium-durum wheat amphiploid lines to find not only lines with possible introgressions, but also lines containing whole wheat chromosomes. Given its efficacy and precision, this approach is likely to shape the future of alien introgression mapping, not only in wheat but in other important crops as well.
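The general idea behind detecting introgressions from read depth alone, without SNP calling, can be illustrated with a minimal sketch. This is not the pipeline used by Keilwagen et al. or Adhikari et al.; the per-bin read counts, the function names `normalize` and `flag_low_bins`, the ratio threshold of 0.5, and the minimum run length of 3 bins are all assumptions chosen for illustration. The sketch assumes that reads from a test line and from a wheat control have already been mapped to a wheat reference and counted in fixed-size bins along one chromosome, and that a segment replaced by alien chromatin shows up as a run of bins with reduced relative depth.

```python
from statistics import median


def normalize(counts):
    """Divide per-bin read counts by their median so that lines
    sequenced to different depths become comparable."""
    m = median(counts)
    if m == 0:
        raise ValueError("median bin count is zero; use larger bins")
    return [c / m for c in counts]


def flag_low_bins(test_counts, control_counts, threshold=0.5, min_run=3):
    """Return (first_bin, last_bin) index pairs for runs of at least
    `min_run` consecutive bins in which the normalized test/control
    depth ratio falls below `threshold` (hypothetical defaults)."""
    test = normalize(test_counts)
    ctrl = normalize(control_counts)
    runs = []
    start = None
    for i, (t, c) in enumerate(zip(test, ctrl)):
        # Bins with no control reads are uninformative and end any run.
        low = c > 0 and (t / c) < threshold
        if low and start is None:
            start = i
        elif not low and start is not None:
            if i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the chromosome.
    if start is not None and len(test) - start >= min_run:
        runs.append((start, len(test) - 1))
    return runs


# Toy data: the test line was sequenced at roughly half the depth of
# the control and has a depth drop in bins 6 to 8.
control = [100] * 10
test = [50, 52, 48, 50, 51, 49, 5, 4, 6, 50]
candidate_regions = flag_low_bins(test, control)
print(candidate_regions)
```

Normalizing by the median rather than the mean keeps the depth drop itself from distorting the baseline, and requiring several consecutive low bins guards against single-bin noise, which matters at very low coverage.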

12.3.5 Agronomically Important Genes Coming from Alien Introgressions

One of the most important alien introgressions in wheat is the 1BL/1RS translocation, in which the short arm of rye chromosome 1R has replaced the short arm of wheat chromosome 1B. This introgression has been used in wheat breeding not only for its associated disease resistance, which has since become obsolete, but also because of the increased root biomass that has a positive effect on yield (Zeller and Hsam 1983; Sharma et al. 2011; Villareal et al. 1998). Despite the negative effects of this translocation on bread making quality, ~30% of modern cultivars contain the 1BL/1RS segment (Wang et al. 2017; Zeller et al. 1982). For a list of varieties containing 1R translocations, visit http://www.rye-gene-map.de/rye-introgression/index.html (see also Ru et al. 2020). In recent years, new 1BL/1RS lines have been developed to overcome some of the shortcomings of older introgressed lines, and both stripe rust resistance and drought tolerance have been observed in these lines (Ren et al. 2022; Sharma et al. 2022; Gabay et al. 2020).

Ae. geniculata is also a genetic goldmine due to the strong disease resistance genes that are present in some accessions. The line TA10437, in which the 5Mg chromosome was sequenced in 2015, contains important resistance genes against damaging pathogens such as stripe rust and leaf rust (Tiwari et al. 2015). Recently, the leaf and stripe rust resistance genes Lr57 and Yr40, respectively, have been fine mapped in Ae. geniculata translocation lines utilizing mapping populations derived from a cross between resistant TA10437-derived introgression lines and susceptible disomic 5Mg addition lines in the background of CHINESE SPRING (Steadham et al. 2021). In this study, Lr57 and Yr40 were fine mapped to a 1.5 Mb region of the introgressed Ae. geniculata 5Mg segment. In addition, phenotyping of the mapping population and of the donor parent of the 5Mg segment provided further evidence that Lr57 confers broad-spectrum resistance, confirming the results of an earlier study (Kuraparthy et al. 2007a, b). Moreover, this study showed that recombination is achievable in alien introgressions by crossing introgression lines with disomic lines containing homologous chromosomes of the alien species.

Sources of disease resistance coming from wild relatives are unequivocally important for the sustenance and improvement of wheat; however, due to the associated linkage drag, durum and bread wheat breeders have made only limited use of "exotic" resistance genes from wild or cultivated relatives in their elite material (Hafeez et al. 2021; Steiner et al. 2019). With the ever-increasing knowledge of wild wheat relatives, however, new genes that confer resistance are being integrated into the germplasm without a yield penalty. The powdery mildew and stripe rust resistance genes Pm5V and Yr5V, respectively, were transferred from the annual diploid wheat relative D. villosum (VV) via amphiploid generation (T. turgidum × D. villosum, AABBVV) (Zhang et al. 2022). In order to integrate these genes into the germplasm, subsequent crossing with elite D. villosum introgression lines was performed and yielded lines with yields comparable to those of elite bread wheat lines. However, because grain softness is also associated with the 5V chromosome, chemical mutagenesis was performed to knock out this undesirable trait, resulting in hard-grained genotypes with comparable yield for utilization in wheat breeding. A summary of disease resistance genes coming from wild relatives is given in Table 12.1.

Table 12.1 Disease resistance genes coming from wild relatives

Beyond resistance, genes from wild relatives that control yield-related and end-use traits have also been utilized by researchers, further demonstrating the benefits of these species. The wild tetraploid wheat relative Agropyron cristatum (PPPP) has been used as a donor for abiotic and biotic stress resistance, as well as for yield-related traits, for over 30 years (Chen et al. 1992; Zhang et al. 2015). Zhang et al. (2018) found that Pubing260, a T3BL.3BS/6PL translocation line containing a small terminal introgression from Ag. cristatum, had increased grains per spike, spikelets per spike, thousand kernel weight, and flag leaf width in comparison with elite bread wheat genotypes without this segment. Additionally, in 2022, a high molecular weight glutenin subunit (HMW-GS) gene coming from Ae. tauschii was directly introduced into bread wheat, and although the dough quality was reduced slightly, the quality of Chinese steamed bread increased (Bo et al. 2022).

12.4 Available Resources for Sequence Data and Plant Material

Availability and accessibility of resources are paramount for the development of higher yielding, disease resistant cultivars of wheat. Fortunately, there exist web-based databases for the extraction of genomic and transcriptomic information regarding wheat and its relatives. Furthermore, there are avenues available for requesting seed material for many of the species mentioned above. In this section, we provide an overview of the publicly available sites that can be used not only to browse and obtain genomic data from bread wheat and its wild relatives but also to request seeds from repositories across the world.

12.4.1 Web-Based Databases for Sequence Data

The National Center for Biotechnology Information (NCBI) is a resource for genetic research on almost any species for which any type of sequence information has been generated (Sayers et al. 2022). Its user-friendly website allows easy searching on any topic, giving results across all 35 of its databases. A simple search for the term "Triticum" on December 12, 2022, yielded results in 26 of the 35 available databases. Over 4 million hits from this search come from the nucleotide database, whereas ~3 million hits come from the protein database. Moreover, NCBI's sequence read archive (SRA) is a significant repository of NGS reads deposited by researchers across the globe. These SRA records are mostly publicly available and include genome and transcriptome data that can be searched with BLAST. A search for T. intermedium in the SRA database yields over 4 thousand results, 184 of which are reads coming from genome sequencing. Suffice it to say, NCBI's website is a significant source of information, especially for those who may not have the funding for their own NGS studies. However, because data is deposited into these databases through so many avenues, navigating to curated information for a specific species can be overwhelming. In particular, when running BLAST against the NCBI databases, many of the hits received may be outdated or may repeat similar information.
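Searches of this kind can also be scripted through NCBI's Entrez E-utilities rather than the website. The sketch below only builds an ESearch request URL and does not contact NCBI; the helper name `build_esearch_url`, the `retmax` value, and the expansion of "T. intermedium" to the organism name "Thinopyrum intermedium" are assumptions made for illustration, and hit counts returned today will differ from those quoted above.

```python
from urllib.parse import urlencode

# Base address of the E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(db, term, retmax=20):
    """Build an ESearch URL for one Entrez database.

    db     -- Entrez database name, e.g. "sra", "nucleotide", "protein"
    term   -- Entrez query string
    retmax -- maximum number of record IDs to return (illustrative value)
    """
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


# Assumed organism name for the "T. intermedium" search described in the text.
sra_url = build_esearch_url("sra", "Thinopyrum intermedium[Organism]")
nucleotide_url = build_esearch_url("nucleotide", "Triticum[Organism]")
print(sra_url)
print(nucleotide_url)
```

Fetching either URL (for example with `urllib.request.urlopen`) should return a JSON document whose `esearchresult` section includes the total hit count and a list of record IDs. Restricting the query to a single database and an organism field is one way to reduce the redundant or outdated hits mentioned above.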

Ensembl overcomes some of the pitfalls of NCBI by allowing users to select specific organisms to browse (Cunningham et al. 2022). Moreover, in Ensembl Plants, species from Animalia, Fungi, and prokaryotes are removed, which simplifies searches for specific plant species. Although the database is not as extensive as NCBI, navigation of certain aspects is much easier. The BioMart and downloads tabs allow easy access to nucleotide and protein data for the species hosted by the website, which can be downloaded from a single web page. Ensembl Plants stays up to date with current versions of reference genomes, including the newest versions of T. aestivum, Ae. tauschii, and H. vulgare, although old versions are still available. Another significant feature of Ensembl is the variation track that is available for some species. This feature allows users to find variants in specific genomic regions, either naturally occurring or induced via chemical mutagenesis. By clicking on this feature, users are able to browse the effects of these variations or, in some cases such as T. aestivum, find accession numbers for mutant genotypes. This is very important for researchers who are looking for variants of candidate genes in gene cloning projects, making it easy to find knockouts and/or missense mutations in candidate regions. Unfortunately, very few relatives of wheat are available for BLAST, genome browsing, or data acquisition. Currently, T. urartu, Ae. tauschii, rye, and barley are the only diploid relatives of wheat that are accessible using this website.

For researchers who work specifically in small grains, GrainGenes is a curated database with many useful features (Yao et al. 2022). Genome browsers are easy to find and available for several wild relatives of wheat, including the five accessions of Ae. tauschii and three of the genomes in the Sitopsis section mentioned above. The BLAST service is also robust, allowing users to select from many wild relatives, including all members of the Sitopsis section. In addition, GrainGenes has an easy-to-use search for markers and probes found in the literature, as well as other useful tools such as genome specific primer (GSP) design. The website, however, has become more cumbersome over the years as more and more data has been added, though a more user-friendly interface is currently being developed.

A wheat-specific database also exists in the form of URGI (Alaux et al. 2018; see also Chap. 2). This site supports wheat-focused research by allowing BLAST searches to be performed on specific chromosomes for all available versions of the reference genome. This is important because different versions of reference genomes are often used in the literature. This site, although not as user friendly as the previously mentioned databases, contains a significant amount of sequence data for wheat.

12.4.2 Germplasm Acquisition Resources

Researchers across the globe are willing to share material with one another for the greater good of assuring food security. In wheat research specifically, seed can be requested from multiple sources. One such example is the Wheat Genetics Resource Center (WGRC), hosted by Kansas State University. This site gives direct access to alien species from the aforementioned Sitopsis section, as well as multiple other Aegilops species, such as Ae. geniculata. Along with this, there is access to Triticum species, including the diploids T. monococcum and T. urartu. The WGRC also holds 95 unique accessions of Dasypyrum villosum coming from several different countries. Alien translocation lines with transfers coming from Aegilops, Dasypyrum, Triticum, Secale, and Agropyron species are directly accessible from this resource as well. This site links to other important germplasm collections and seed distributors such as CIMMYT and the USDA.

CIMMYT (the International Maize and Wheat Improvement Center) and the USDA utilize the Germplasm Resources Information Network (GRIN) or GRIN-Global to give international institutions access to germplasm of several different species of plants, including wheat and some of its wild relatives. Although this resource is not specifically catered to wheat researchers, wild species belonging to Aegilops and Triticum are available. Similarly, Genesys is a resource for multiple different crop systems, but its user-friendly interface allows easy searching for species in Triticum. This site contains over 12 thousand accessions of Aegilops alone, and they are organized into subsets, including Aegilops core sets.

The OWWC, mentioned in Sect. 12.2.1, has made its panel available through the Germplasm Resources Unit (GRU) hosted by the John Innes Centre. The GRU offers material similar to that of the aforementioned sites; however, it also has a core collection of Triticeae wild relatives that includes Dasypyrum, Aegilops, Triticum, and Eremopyrum. This site also contains seed resources for mutant, DH, and other mapping populations in wheat, as well as historical landraces.

12.5 Concluding Remarks

The ever-increasing breadth of knowledge coming from wheat and its relatives has large implications for improving the overall quality of cultivars in the coming years. This chapter gives an up-to-date overview of recent advances in genomic resources within wheat, highlighting the importance of wild relatives and of alien introgressions within the germplasm. The availability of the wheat pan-genome has allowed researchers to trace introgressions that are present within cultivars across the world. Some of these alien introgressions were found across the entire pan-genome, giving further evidence of the importance of this genetic diversity (Keilwagen et al. 2022). As more of these wild genomes receive reference quality assemblies, the more we can learn about the important genes that lie within these species. According to the OWWC website, a pan-genome of Ae. tauschii is currently underway, which will allow researchers to gain a more in-depth understanding of the diversity present in this progenitor species. High-quality reference genomes are still required for some important species, such as T. elongatum, D. villosum, and Agropyron species. Researchers would also benefit from pan-genomes representing other important wild relatives that have been mentioned in this chapter, such as wild diploid Triticum species and tetraploid Aegilops species.

Extensive resources for obtaining both genomic data and seed material for these species are available for public use, making further novel research possible across the globe. It is an exciting time to work in the field of wheat research, with the ability to obtain diverse populations of not only bread wheat and its primary gene pool, but also members of the secondary and tertiary gene pools from collaborators in different countries. The web-based resources that exist now make quick turnaround possible, not only for basic scientific knowledge but also for the integration of this diversity into local breeding programs. A future prospect that could make this process even more efficient is a single, centralized database through which these independent seed and data repositories can be accessed. CIMMYT and the USDA make it easy to find material from either establishment by utilizing systems like GRIN and GRIN-Global, which share germplasm requests; however, many other institutions do not use these systems for requests and distribution, and many of them are not necessarily catered toward wheat-based research. A similar central database would be beneficial for the growing amount of sequence data that is becoming available in Triticeae. A system to search for data pertaining to specific gene pools could prove beneficial for future research, especially as more genomes are sequenced. The examples and information provided here will hopefully make it easier for researchers, students, and curious minds alike to find information pertaining to wheat and the many species that make up its gene pools.