Worldwide, ex situ genebanks are tasked with storing seeds (and, to some extent, other plant material) to guard against the extinction of plant genetic resources (Linington and Pritchard 2001). Today, genebanks store > 7.4 million accessions (FAO 2010), of which ~ 45% are cereal species (Börner et al. 2014). Regular monitoring of germination capacity is central to any genebank, and a drop below a certain threshold determines when an accession must be regenerated. Seed longevity is defined as the maximum time period over which seeds maintain germination viability (Sano et al. 2016).

Seed longevity varies among species and can be influenced by several environmental factors during seed formation, harvest and storage (Liu et al. 2019). The deterioration in viability may be due to damage to membranes and DNA and to the action of a variety of enzymes and other proteins (Coolbear 1995; McDonald 1999). Among the agents responsible for seed ageing identified to date, lipid peroxidation seems the most potent (Davies 2005; Wiebach et al. 2020), in addition to damage to DNA and proteins (Rao et al. 1987; Bailly et al. 2008).

Although seed longevity is a quantitative trait, certain major loci seem to exist. For example, mutations within the genes DOG1 (DELAY OF GERMINATION1) and SNL1/2 (SWI-INDEPENDENT3-LIKE) in the model plant Arabidopsis are associated with seed longevity (Bentsink et al. 2006; Wang et al. 2013). In tobacco, overexpression of Heat Shock Factor A9 has been shown to enhance seed longevity by increasing the amount of heat shock proteins (Prieto-Dapena et al. 2006; Kotak et al. 2007).

Genetic analysis of seed longevity in crop plants was first initiated in rice (Miura et al. 2002), followed by soybean (Singh et al. 2008), barley (Nagel et al. 2009) and maize (Revilla et al. 2009). In bread wheat (Triticum aestivum), genetic research on seed longevity began with Landjeva et al. (2010), who used microsatellite loci in a set of common wheat lines carrying D genome introgression segments of the wild ancestor Aegilops tauschii. Rehman Arif et al. (2012a) followed with bi-parental (RFLP and SSR markers) and association mapping (DArT markers) approaches to elucidate genetic loci for longevity in wheat. Furthermore, Rehman Arif et al. (2017) mapped a range of loci in germplasm (183 accessions) selected from the Gatersleben genebank using DArT markers. More recently, using a population of 246 recombinant inbred lines (RILs), Zuo et al. (2018) identified 96 loci for seed vigor-related traits under artificial aging, and Zuo et al. (2019) uncovered a further 23 longevity loci in 166 RILs. Both of these studies used single nucleotide polymorphism (SNP) markers. Loci linked to longevity were also identified in durum wheat (Triticum durum) (Rehman Arif and Börner 2019). Nevertheless, genetic studies of seed longevity in wheat are still at an early stage, and new molecular marker data covering regions left empty in previous studies may provide new insights into this trait.

Here, we report a re-analysis of two association mapping panels (a winter wheat and spring wheat collection investigated by Rehman Arif et al. (2012a, 2017), respectively) using the phenotypic data already available but newly created SNP marker data to look for potential novel loci linked to longevity, to search for possible candidate genes and to obtain a better understanding of the mechanisms in seed deterioration in wheat.

Materials and methods


The first reassessed germplasm set is composed of 96 winter wheat advanced lines (WW), which have been extensively investigated for agronomic traits, longevity, dormancy and pre-harvest sprouting (Neumann et al. 2011; Rehman Arif et al. 2012a, b). The second reassessed germplasm set is composed of 111 spring wheat accessions (SW) (Table S1) selected from the panel of 183 accessions reported by Rehman Arif et al. (2017). All 207 accessions of WW and SW were analyzed using a 15 K Infinium SNP array, which is an optimized and reduced version of the 90 K iSELECT SNP-chip described by Wang et al. (2014). Data for 11,139 and 9804 SNPs from the public domain of IPK were used to identify markers linked with seed longevity in WW and SW, respectively.


Phenotypic data from Rehman Arif et al. (2012a, 2017) for WW and SW, respectively, were used. Briefly, to assess longevity, three replicates of 100 seeds each from both WW and SW were subjected to accelerated ageing (AA) and controlled deterioration (CDT) tests. For AA, seeds were exposed to 43 ± 1 °C for 3 days at 100% relative humidity, followed by a standard International Seed Testing Association (ISTA) germination test in which three replicates of 100 seeds were placed between two layers of wet filter paper, formed into rolls and stood on a Jacobsen apparatus at 25 ± 1 °C during the day and 23 ± 1 °C during the night. For CDT, the moisture content of the seed lots was brought to 18%, after which they were sealed in an aluminum foil bag and exposed to 43 ± 1 °C for 3 days. The germination percentages were recorded after 7 days. Initial germination (IG; control), germination after AA (GAA) and germination after CDT (GCD) were determined. Relative values (RAA and RCD) were calculated for both WW and SW by dividing GAA and GCD, respectively, by IG and multiplying by 100. RAA and RCD were used for association mapping analysis to determine longevity loci.
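The relative germination values described above can be illustrated with a minimal sketch. The function name and the example accession values are hypothetical and are not taken from the phenotypic data sets; only the formula (germination after treatment divided by IG, times 100) comes from the text.

```python
def relative_germination(germ_after_treatment: float, initial_germination: float) -> float:
    """Germination after ageing expressed as a percentage of initial germination."""
    if initial_germination <= 0:
        raise ValueError("initial germination must be positive")
    return germ_after_treatment / initial_germination * 100.0


# Hypothetical accession (illustrative values only): IG = 95 %, GAA = 19 %, GCD = 57 %
ig, gaa, gcd = 95.0, 19.0, 57.0
raa = relative_germination(gaa, ig)  # relative germination after accelerated ageing
rcd = relative_germination(gcd, ig)  # relative germination after controlled deterioration
print(f"RAA = {raa:.2f}, RCD = {rcd:.2f}")
```

Expressing GAA and GCD relative to IG removes differences in initial seed quality, so that accessions are compared on the loss of germination rather than on its absolute level.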

Genotypic data of both WW and SW were subjected to population structure analysis prior to association mapping, using subsets of 241 and 229 evenly spaced SNPs for WW and SW, respectively. STRUCTURE v.2.3.4 (Pritchard et al. 2000) was run with the admixture model, a burn-in of 100,000 iterations and 100,000 MCMC iterations to test K values in the range 1–15. The results were processed with Structure Harvester (Earl 2012) to obtain a clear picture of the sub-populations in both germplasm sets.

Association mapping was carried out using TASSEL 5.2.43 (Bradbury et al. 2007) with a mixed linear model (MLM) (Yu et al. 2006), which takes into account population structure (calculated with STRUCTURE v.2.3.4) and kinship (calculated with TASSEL 5.2.43). The significance threshold for each set was calculated as the reciprocal of the number of markers. Thus, associations with p-values below 8.97 × 10⁻⁵ in WW and 1.019 × 10⁻⁴ in SW were considered significant. The flanking sequences of SNPs associated with longevity were obtained from the Wheat 90 K SNP array database (Wang et al. 2014). Gene ontology (GO) was assessed using BLAST2GO v.3 software.
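The marker-based significance thresholds can be reproduced with a short sketch. The function name is hypothetical; the marker counts are those given for WW and SW. The computed values agree with the reported thresholds when the latter are read as truncated rather than rounded.

```python
def significance_threshold(n_markers: int) -> float:
    """Reciprocal of the number of markers tested in one panel."""
    if n_markers <= 0:
        raise ValueError("number of markers must be positive")
    return 1.0 / n_markers


ww_threshold = significance_threshold(11_139)  # winter wheat panel
sw_threshold = significance_threshold(9_804)   # spring wheat panel
print(f"WW: {ww_threshold:.3e}, SW: {sw_threshold:.3e}")
```

Under the null hypothesis, and assuming independent tests, a threshold of 1/n corresponds to roughly one expected false positive per genome scan, which is considerably stricter than a fixed p value of 0.05 or 0.01.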


Genotypic characterization

The distribution of SNPs was similar in both collections. A total of 11,139 SNPs were mapped in WW, covering a distance of 3639.8 cM (3.06 SNPs/cM). Likewise, 9804 SNPs were mapped in SW, covering a distance of 3624.71 cM (2.70 SNPs/cM) (Table S2). Marker density was not uniform: the B genome carried the highest number of SNPs (5479 in WW and 4831 in SW), followed by the A genome (4313 in WW and 3843 in SW), whereas the D genome was sparsely covered (1347 in WW and 1130 in SW) (Fig. S1).
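As a consistency check on the figures above, the following sketch recomputes the total SNP number, the overall marker density and the share of each genome from the per-genome counts. The variable and function names are hypothetical; all numbers are those stated in the text.

```python
# Per-genome SNP counts and total map lengths as reported in the text
snp_counts = {
    "WW": {"A": 4313, "B": 5479, "D": 1347},
    "SW": {"A": 3843, "B": 4831, "D": 1130},
}
map_length_cm = {"WW": 3639.8, "SW": 3624.71}


def summarize(panel: str) -> dict:
    """Return total SNPs, SNPs per cM and the percentage of SNPs per genome."""
    counts = snp_counts[panel]
    total = sum(counts.values())
    return {
        "total": total,
        "density": total / map_length_cm[panel],
        "share_pct": {g: n / total * 100.0 for g, n in counts.items()},
    }


ww_summary = summarize("WW")
sw_summary = summarize("SW")
for panel, s in (("WW", ww_summary), ("SW", sw_summary)):
    print(panel, s["total"], round(s["density"], 2),
          {g: round(p, 1) for g, p in s["share_pct"].items()})
```

The per-genome counts sum to the reported totals, and the D genome accounts for only about 12% of the markers in either panel.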

Phenotypic variation

The phenotypic variation in IG, GAA and GCD indicates that IG was quite high in both WW and SW (Figs. S2, S3). Mean values in WW were 93.70 (SD 5.05), 10.97 (SD 12.08) and 61.05 (SD 25.33) for IG, GAA and GCD, respectively. Likewise, mean values in SW were 85.78 (SD 11.94), 68.47 (SD 17.76) and 59.04 (SD 21.61) for IG, GAA and GCD, respectively. Mean RAA and RCD in WW were 11.55 (SD 12.76) and 64.57 (SD 25.83), respectively (Fig. 1). In SW, mean RAA was 79.87 (SD 16.90) and mean RCD was 69.03 (SD 21.55) (Fig. 2).

Fig. 1

Frequency distribution of relative germination after AA (RAA, a) and CD (RCD, b) in WW. Black arrows indicate mean

Fig. 2

Frequency distribution of relative germination after AA (RAA, a) and CD (RCD, b) in SW. Black arrows indicate mean

Population structure and association mapping

Population structure analysis revealed 3 subgroups in WW (Fig. S4) and 4 subgroups in SW (Fig. S5). Association analysis of WW uncovered a total of 16 associations on chromosomes 1A, 2A (10 associations), 2D, 6A (3 associations) and 7A with a p value < 8.97 × 10⁻⁵ for RAA, whereas no association passed this p value for RCD. Likewise, a total of 56 associations (10 with RAA and 46 with RCD) at p value < 1.019 × 10⁻⁴ were uncovered in SW. These were observed on chromosomes 1A, 1B (10 associations), 2A (2 associations), 2B (6 associations), 2D, 3A (2 associations), 3B, 4A (2 associations), 4B (16 associations), 5B (9 associations), 7A (3 associations), 7B (2 associations) and 7D, with one association common to both RAA and RCD located on chromosome 5B (Fig. 3).

Fig. 3

Longevity QTLs in WW (underlined SNPs) and SW. Markers in bold are reported using DArT marker system exploiting the same phenotypic data in either Rehman Arif et al. (2012a) (a) or Rehman Arif et al. (2017) (b). QTLs with asterisks are potentially novel


Genetic markers are very useful for identifying regions and genes involved in seed longevity, as they can inform germplasm curators and plant breeders when it is time to regenerate the seeds of a given accession of any species. Moreover, some candidate genes have been identified that influence the trait (Debeaujon et al. 2000; Clerkx et al. 2004; Sattler et al. 2004; Xu et al. 2004; Bentsink et al. 2006; Prieto-Dapena et al. 2006; Devaiah et al. 2007; Ogé et al. 2008; Rajjou et al. 2008; Almoguera et al. 2009).

In wheat, some studies investigating bi-parental mapping populations have linked longevity with SNPs (Zuo et al. 2018, 2019). In this study, we used SNPs in two association mapping wheat collections. Fewer marker trait associations were detected using the SNP data of WW and SW than were reported by Rehman Arif et al. (2012a, 2017), because both of those studies used a default criterion of a p value of 0.05 or 0.01. In this study, by contrast, we used p-values of 8.97 × 10⁻⁵ (for WW) and 1.019 × 10⁻⁴ (for SW), calculated by taking the reciprocal of the number of tests (markers) performed per panel, to help us find true associations.

The associations discovered could be confined to 24 QTLs based on the proximity of the markers to each other (Fig. 3). Among them, 4 QTLs were observed in WW, 18 QTLs in SW and 2 QTLs were common to both WW and SW (Fig. 3). The loci were distributed on chromosomes 1A (2 QTLs), 1B, 2A (4 QTLs), 2B (2 QTLs), 2D, 3A, 3B, 4A (2 QTLs), 4B (2 QTLs), 5B (3 QTLs), 6A, 7A (2 QTLs), 7B and 7D. Of these, nine QTLs on chromosomes 1A (Q.Lng.ipk.1A(WW) and Q.Lng.ipk.1A(SW)), 1B (Q.Lng.ipk.1B(SW)), 2A (Q.Lng.ipk.2A.1(SW)), 2D (Q.Lng.ipk.2D(WW/SW)), 3B (Q.Lng.ipk.3B(SW)), 4A (Q.Lng.ipk.4A.2(SW)), 4B (Q.Lng.ipk.4B.1(SW)) and 5B (Q.Lng.ipk.5B.2(WW)) have been reported before (Rehman Arif et al. 2012a, 2017). Moreover, three QTLs on chromosomes 2B, 3B and 4A (Q.Lng.ipk.2B.2(SW), Q.Lng.ipk.3B(SW) and Q.Lng.ipk.4A.2(SW)), including the two on 3B and 4A mentioned above, have been reported by Zuo et al. (2019). In addition, Q.Lng.ipk.2A.2(SW) is at the same position where Zuo et al. (2018) located a mean germination rate QTL. Between 51 and 60 cM on 3B, Zuo et al. (2018) found four QTLs linked to longevity-related traits that coincide with our QTL Q.Lng.ipk.3B(SW). The other 13 QTLs on chromosomes 2A (Q.Lng.ipk.2A.1(WW) and Q.Lng.ipk.2A.2(WW)), 2B (Q.Lng.ipk.2B.1(SW)), 3A (Q.Lng.ipk.3A(SW)), 4A (Q.Lng.ipk.4A.1(SW)), 4B (Q.Lng.ipk.4B.2(WW)), 5B (Q.Lng.ipk.5B.1(SW) and Q.Lng.ipk.5B.3(SW)), 6A (Q.Lng.ipk.6A(WW)), 7A (Q.Lng.ipk.7A(WW/SW) and Q.Lng.ipk.7A(SW)), 7B (Q.Lng.ipk.7B(SW)) and 7D (Q.Lng.ipk.7D(SW)) are potentially novel. The new QTLs in this study highlight the importance of adequate genome coverage for identifying most of the loci influencing the trait of interest.
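Grouping significant markers into QTLs by proximity can be sketched as follows. This is only an illustration under stated assumptions: the 10 cM maximum gap, the function name, the marker names and the positions are all hypothetical, since the exact distance criterion used for Fig. 3 is not given in the text.

```python
from collections import defaultdict


def group_into_qtls(associations, max_gap_cm=10.0):
    """Merge significant markers on the same chromosome into one QTL when
    neighbouring markers lie no more than max_gap_cm apart.

    associations: iterable of (marker, chromosome, position_cM) tuples.
    max_gap_cm is an assumed window, not a value taken from the study.
    Returns a list of (chromosome, [markers]) tuples.
    """
    by_chrom = defaultdict(list)
    for marker, chrom, pos in associations:
        by_chrom[chrom].append((pos, marker))

    qtls = []
    for chrom in sorted(by_chrom):
        current, last_pos = [], None
        for pos, marker in sorted(by_chrom[chrom]):
            # Start a new QTL when the gap to the previous marker is too large
            if current and pos - last_pos > max_gap_cm:
                qtls.append((chrom, current))
                current = []
            current.append(marker)
            last_pos = pos
        qtls.append((chrom, current))
    return qtls


# Hypothetical significant markers (names and positions are made up)
example = [
    ("m1", "2A", 12.0),
    ("m2", "2A", 15.5),
    ("m3", "2A", 80.0),
    ("m4", "6A", 40.0),
]
qtls = group_into_qtls(example)
for chrom, markers in qtls:
    print(chrom, markers)
```

With this hypothetical input, the two neighbouring markers on 2A form one QTL, while the distant marker on 2A and the marker on 6A each form their own.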

BLAST analysis of the 55 marker sequences associated with longevity, belonging to 16 QTLs, revealed a total of 37 genes probably involved in seed longevity (Table S3). Using the deletion bin confinement of DArT markers, Rehman Arif et al. (2012a, 2017) reported a number of probable candidate genes for longevity. In this report, we narrowed the probable candidate genes down to 37, which can potentially be targeted in advanced molecular research on seed longevity in wheat. Zuo et al. (2018) reported five candidate genes from the analysis of 96 QTLs in RILs. The candidate genes common to Zuo et al. (2018) and this study are stem rust resistance protein Rpg1 and NBS-LRR resistance-like protein. Likewise, Zuo et al. (2019) reported three candidate genes for longevity in wheat from 23 QTLs in 166 RILs. The candidate gene common to Zuo et al. (2019) and this study is FAR1-related sequence 6-like protein, which is expressed in hypocotyls, rosette and cauline leaves, inflorescence stems and flowers, and is linked to positive regulation of circadian rhythm and transcription (Lin and Wang 2004). Moreover, it is also reported to be involved in ABA signal transduction and abiotic stress response pathways (Ma and Li 2018).

In SW, we formed two groups of 35 accessions each. The first group (a) carried between 3 and 11 favorable alleles and the second group (b) carried between 14 and 17 favorable alleles for the 20 QTLs reported in SW. Mean RAA and RCD in group (a) were 69.25 ± 18.90 and 56.85 ± 26.37, respectively. Likewise, mean RAA and RCD in group (b) were 84.25 ± 15.94 and 77.86 ± 12.83, respectively. Thus, with the pyramiding of favorable alleles, increases of 5.47% and 12.79% over the SW population means were observed in RAA and RCD, respectively (Table 1). This supports the view that seed longevity is a polygenic trait with each locus imparting some improvement in an additive manner (Zuo et al. 2019), as the accessions carrying more favorable alleles showed higher longevity.
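The reported gains can be approximately reproduced with the following sketch, which assumes that the increase is expressed relative to the SW population means of RAA (79.87) and RCD (69.03) given in the phenotypic results. The function name is hypothetical. Under this assumption the calculation gives 12.79% for RCD and about 5.48% for RAA, close to the reported 5.47%; the small difference is presumably due to rounding of the published means.

```python
def percent_change(group_mean: float, population_mean: float) -> float:
    """Change of a group mean relative to the population mean, in percent."""
    return (group_mean - population_mean) / population_mean * 100.0


# Means as reported in the text for the SW panel
population = {"RAA": 79.87, "RCD": 69.03}
group_a = {"RAA": 69.25, "RCD": 56.85}  # 3 to 11 favorable alleles
group_b = {"RAA": 84.25, "RCD": 77.86}  # 14 to 17 favorable alleles

change_a = {t: percent_change(group_a[t], population[t]) for t in population}
change_b = {t: percent_change(group_b[t], population[t]) for t in population}

for trait in population:
    print(f"{trait}: group (a) {change_a[trait]:+.2f}%, group (b) {change_b[trait]:+.2f}%")
```

Group (a), with fewer favorable alleles, falls below the population mean for both traits, whereas group (b) lies above it.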

Table 1 Mean RAA, RCD and increase or decrease in RAA and RCD from population in group (a) and group (b) in SW with respect to favorable QTLs


Our analysis discovered 13 potentially novel loci for seed longevity using SNP whole genome mapping in two different association mapping populations in wheat. These loci went unnoticed in previous reports, which highlights the importance of dense genetic maps covering otherwise uncovered parts of the genome for detecting novel loci for seed longevity. Moreover, since more and more populations are being characterized with SNPs, the results of this investigation will help genebank curators and plant breeders to decide when to regenerate their germplasm.