Introduction

An estimated 345,777 species of vascular plants [lycophytes, pteridophytes and seed plants (gymnosperms and angiosperms)] are known to science, of which 332,857 species are seed plants (WCVP 2020). Of these, assessments of the proportion of species threatened with extinction can vary from ca 22% (or, one in five, Brummitt et al. 2015; RBG Kew 2016) to 36–37% (Bachman et al. 2017). While in situ conservation of species and habitats is usually considered to be optimal, the Global Strategy for Plant Conservation (GSPC) recognises the significant, complementary role of ex situ conservation to achieve its Targets 8 and 9 by 2020 (CBD 2012).

Seeds represent the next generation of plants when they germinate and develop into seedlings. They are the primary propagule used for regeneration and reintroduction of plant species in ecological restoration to mitigate environmental degradation and species extinction (Broadhurst et al. 2008; León-Lobos et al. 2012; Elzenga and Bekker 2017). Conventional seed banking, whereby seeds can be preserved dried and deep frozen for many tens, if not hundreds of years, is identified as a valuable ex situ conservation tool for integrated plant conservation, both as an archive and a source of genetic variation (Gargiulo et al. 2019). It is also a practical, efficient and attractive method due to its low cost and high storage capacity. As seed banks are well placed to address both GSPC targets 8 and 9, the number of ex situ conservation facilities for wild plants has grown dramatically, and ex situ conservation of threatened plants has become a national and global priority (Li et al. 2010; CHABG 2011; León-Lobos et al. 2012; Hay and Probert 2013; Liu et al. 2018). To ensure the use of such conserved germplasm, seed collections must be of high quality and viability, and in the correct physiological state to germinate and establish seedlings (Godefroid et al. 2010; León-Lobos et al. 2012). There are more than 1750 conventional seed banks in the world and the majority focus on conserving germplasm of crop species and their closest wild relatives; while others focus on species of global or national economic importance (e.g. horticultural crops, fruits and timber species), or wild species (Hay and Probert 2013), sometimes those that are most threatened.

In response to GSPC Target 8, around half of European threatened species have been conserved ex situ in seed banks (Rivière et al. 2018); and 41% of known threatened species are conserved by botanic gardens around the world (Mounce et al. 2017). However, while reliance on conventional seed banking appears reasonable for the majority of crop wild relatives (except tropical species such as coffee, coconut, cocoa, etc.) under Target 9, Wyse et al. (2018) have shown it is unlikely to be so for achieving Target 8, particularly for threatened species, and especially when they represent tropical forest tree species. This is because an estimated high proportion of threatened and tropical moist forest species bear seeds that are sensitive to the drying required for conventional seed banking. For such species, cryopreservation may be the only resource to ensure effective ex situ seed conservation (Li and Pritchard 2009; Hay and Probert 2013).

The Millennium Seed Bank (MSB), managed by the Royal Botanic Gardens, Kew (RBG Kew) is the culmination of ex situ seed conservation that began at RBG Kew in the 1960s (Dickie 2018). Where possible, seeds are collected and conserved in the country of origin with duplicates being sent to RBG Kew’s − 20 °C long-term storage at the MSB. The global partnership associated with the MSB has led to the creation of an extremely valuable and rich biological resource of seed collections representing substantial taxonomic diversity, wide geographic coverage, notable uniqueness and irreplaceability, significant natural capital and population value and high-quality germplasm (Liu et al. 2018). Although gaps in coverage of threatened plants were noted for MSB germplasm, at least 10% of taxa (including species and subspecific epithets), represented by over 6703 collections (> 8% of total holdings), are either extinct, rare or vulnerable to extinction at the global and/or national scale (Liu et al. 2018). This includes at least 667 taxa, representing nearly 1600 collections, that are declared globally threatened (EW-extinct in the wild, CR-critically endangered, EN-endangered or VU-vulnerable) by the IUCN (2016). Under-representation of threatened taxa may be linked to geographic rarity, or to taxa bearing desiccation sensitive (recalcitrant) seeds, which are not suitable for preservation in conventional seed banks (Liu et al. 2018). Since the study of Liu et al. (2018), there have been new or revised global level IUCN Red List assessments and a net increase in overall collections of threatened taxa conserved at the MSB.

Restoration, revegetation and species reintroduction programmes often demand vast quantities of high quality and genetically diverse germplasm (seeds) to establish self-sustaining populations and maximise the adaptive potential of conservation efforts to current and future environmental change. To support this, and to help plan more efficient future seed sampling and seed conservation activities, knowledge of population structure and of seed quality, viability, germinability and vigour is needed, in addition to information on how to break seed dormancy and germinate the seeds (Mortlock 2000; Cochrane et al. 2007; Broadhurst et al. 2008; Merritt and Dixon 2011; Hay and Probert 2013). The objective of this study is to review the status of seed collections of globally threatened plants conserved in − 20 °C long-term storage at the MSB, RBG Kew in terms of their geographic and bioclimatic representativeness, taxonomic and genetic diversity, quality and physiological status. Our aim is to provide useful information to support plant conservation and research worldwide by identifying strengths and weaknesses in the status of collections and suitable germination protocols for propagation of these plants from seeds.

Materials and methods

The MSB’s Seed Bank Database (SBD) contains in-depth data on seed collections stored at the MSB, and plays a significant role in collection acquisition, curation, management, monitoring, prioritisation and reporting. Data are gathered from the point of seed collection in the field (field or passport data) and during the lifespan of seeds in storage (processing data). By matching plant names (exact match), data for collections representing globally threatened seed plants according to IUCN (2017) were extracted from SBD on 08 January 2018. It is possible that plant names on IUCN (2017) could match indirectly to MSB collections (Liu et al. 2018), but these were excluded from our review. In total, 1172 collections had exact name matches, representing 569 taxa (number of collections followed by taxa in brackets): EW (21, 7); CR (304, 125); EN (340, 195) and VU (507, 242). As seed germination protocols need to be identified from collections with high taxonomic certainty, only those collections with verified names and with at least one completed post-storage germination test were included in this study to establish physiological status. Consequently, the representative sample included 523 collections of 303 taxa. Seed conservation protocols related to subsequent paragraphs are discussed in Appendix 1.

Geographic and bioclimatic representativeness

Cultivated collections inherited the geographic origin of the wild plant population from which they were propagated or regenerated. To investigate any degree of geographical bias, locality data were analysed at the bio-geographic continent and country level to understand the representation of taxa across the globe. To illustrate the geographic origin of taxa according to political boundaries of countries, ISO country level data were processed and displayed in ArcGIS Desktop 10.5 (ESRI 2012). To examine the bioclimate of the habitat from which seeds were harvested, the geographic coordinates of the collections were mapped according to Köppen–Geiger climate classification (Köppen 1936).

Taxonomic and genetic diversity

To investigate any degree of taxonomic bias, the taxonomic composition of the collections (e.g. number of families, genera, species and taxa) was examined to understand the diversity of globally threatened plants represented at the MSB. We used the phylogenetic tree of Smith and Brown (2018), reduced to include only one tip for each family of angiosperms and gymnosperms, to visualise family level distribution pattern of species in the sample. Species were assigned to families following World Checklist of Vascular Plants (WCVP 2020).

Wild plants represent an extremely rich genetic resource, harbouring useful genes not available in the cultivated gene pool (Ceoloni et al. 2017). Therefore, it is important to capture wild genetic diversity within plants, especially for rare, threatened and economically important species (Neel and Ellstrand 2003; Laikre 2010). Adopting a single spatial sampling strategy and sample size for all species will likely lead to suboptimal collections with low genetic diversity, and ideal collection sizes vary widely even within genus (Hoban et al. 2020). It is important to consider population structure and adjust the sampling strategy to capture locally restricted alleles or traits, which have high conservation, ecological, or economic value (Hoban and Schlarbaum 2014). While, for example, the phylogenetic diversity of the MSB legume collections (Griffiths et al. 2015) and the genetic diversity in MSB European yew collections made in the UK (Gargiulo et al. 2019) have been assessed in separate studies, such a strategy is not currently possible for most of the MSB’s collections, as the requisite genetic data is not available for them.

In the absence of genetic analysis on the germplasm, the number of collections per taxon and the number of seeds per collection were used as surrogates for genetic diversity (Godefroid et al. 2011). Such methods only demonstrate potential or likely genetic diversity within and among collections as the breeding biology or population genetics of the germplasm have not been assessed. To ensure that representative genetic diversity is captured from a population, and to enable the use of seeds for viability monitoring, regeneration, distribution, research and conservation, it is recommended to harvest 10,000–20,000 seeds from at least 50 individual plants in one population for long-term conservation (Way 2003). To capture the genetic diversity of an individual taxon alone, at least five collections should be made, each containing 5000 seeds and originating from a different wild population (Brown and Briggs 1991; ENSCONET 2009).

To review the genetic diversity of a population captured, we inspected data on the total number of individual plants harvested to make the original collection. Data was available for 280 of the 523 collections. Where the number of plants harvested was indicated by a range (e.g. 25–50 plants), we converted it to a rounded-up middle value (in this case 38). Where a > sign had been used (e.g. > 100 plants), we converted the number to one value less than the subsequent place value for tenth, hundredth or thousandth (in this case 199). The quantity of potentially viable seeds or usable seeds conserved for collections is covered in “Quality of seeds”.
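The two conversion rules described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used on the SBD: the function name `plants_sampled` and the input format are hypothetical, and the handling of open-ended counts other than the "> 100" example (for instance "> 10" or "> 1000") is an assumed generalisation of the stated rule.

```python
import math
import re


def plants_sampled(raw: str) -> int:
    """Convert a recorded plant count (hypothetical text field) to one integer."""
    text = raw.replace("–", "-").replace(" ", "")

    # A range such as "25-50": take the midpoint and round it up.
    match = re.fullmatch(r"(\d+)-(\d+)", text)
    if match:
        low, high = int(match.group(1)), int(match.group(2))
        return math.ceil((low + high) / 2)

    # An open-ended count such as ">100": one less than the next step of the
    # same place value (assumed generalisation: >10 -> 19, >1000 -> 1999).
    match = re.fullmatch(r">(\d+)", text)
    if match:
        n = int(match.group(1))
        place = 10 ** (len(str(n)) - 1)
        return n + place - 1

    # A plain count is used as recorded.
    return int(text)
```

With the examples given in the text, `plants_sampled("25–50")` gives 38 and `plants_sampled("> 100")` gives 199.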

To estimate the level of genetic diversity captured for a taxon we examined data on the total number of collections and potentially viable or usable seeds currently conserved from different populations of that taxon. As we excluded collections with unverified names and/or without at least one completed post-storage germination test, this may underestimate the genetic diversity captured in the taxa. Therefore, only in this instance, for 303 of the taxa represented in the reviewed collections, we added corresponding collections (314 in total) that we excluded initially. Of the 314 collections added, seed quantity data are not available for 33 collections.

We also assigned the biological status of collections as of wild or cultivated origin by examining heredity, geographic origin, habitat, taxonomy and regeneration data. Collections originating from natural or semi-natural habitats were classified as ‘wild origin’, while those originating from cultivated habitats and propagation or regeneration activities were classified as ‘cultivated origin’ (Alercia et al. 2012).

Quality of seeds

Radiographic analysis has proved to be an effective non-destructive method of monitoring seed quality by identifying empty, insect damaged and malformed seeds and enabling the estimation of the number of potentially viable or usable seeds in a collection (Guedes et al. 2014). This technology is used at a number of wild species seed banks as an alternative to cut-testing seeds for assessing the quality of collections as well as to determine the number of seeds that need to be sown to compensate for embryoless and infested seeds when setting up germination tests.

We reviewed the quality of collections in terms of the total number of potentially viable or usable seeds as well as cumulative percentage of unusable seeds observed over time by using seed X-ray or cut-test data generated during seed cleaning and germination tests (see Appendix 1). The initial quality indicates the status of the original collections received post-cleaning but pre-banking in − 20 °C long-term storage at the MSB. The current quality indicates the status of collections at the time of review, after long-term storage at − 20 °C, and this quality varies over time (e.g. reduction of potentially viable seeds when more unusable seeds are observed over time or seeds are used for curation, conservation, education and display activities). Where the sample size was sufficient, we analysed for taxonomic patterns in the proportions of seeds lacking embryos or infested.

Initial quality

The initial quality of seeds in the original collections was assessed by estimating the total number of potentially viable or usable seeds (AOq, adjusted original seed quantity) using seed X-ray or cut-test data generated at the seed cleaning stage as a proportion of the original seed quantity (Oq)

$${\text{AO}}_{{\text{q}}} = \frac{{{\text{O}}_{{\text{q}}} \times {\text{X}}_{{\text{f}}} }}{{{\text{X}}_{{\text{n}}} }}$$

where \({\text{X}}_{{\text{f}}}\) is the number of full seeds observed and \({\text{X}}_{{\text{n}}}\) is the total number of seeds that were X-rayed or cut-tested. Any significant loss of usable seeds due to empty and infested seeds in the original collections was investigated by testing the null hypothesis that there was no difference in total quantities between \({\text{O}}_{{\text{q}}}\) and \({\text{AO}}_{{\text{q}}}\), using a one-tailed (right-tailed) paired t-test in GenStat software (VSN International Ltd).
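As a minimal sketch of the calculation, the adjusted quantity and the paired t statistic can be written as follows. The function names are hypothetical, the analysis in the study was run in GenStat rather than Python, and the sketch stops at the t statistic and degrees of freedom (the one-tailed probability would come from a t distribution, which the standard library does not provide).

```python
import math
import statistics


def adjusted_quantity(quantity: float, full: int, examined: int) -> float:
    """Scale a seed quantity by the proportion of full seeds (X_f / X_n)."""
    if examined <= 0:
        raise ValueError("at least one seed must have been X-rayed or cut-tested")
    return quantity * full / examined


def paired_t(original, adjusted):
    """Return (t, degrees of freedom) for paired differences original - adjusted."""
    diffs = [o - a for o, a in zip(original, adjusted)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation of the differences
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1
```

For example, a collection of 2000 seeds in which 45 of 50 examined seeds were full has an adjusted quantity of `adjusted_quantity(2000, 45, 50)`, that is 1800 seeds. The same function applies unchanged to a current seed quantity.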

Current quality

The current quality of seeds was assessed by estimating: (1) the total number of potentially viable or usable seeds in the current collection (\({\text{AC}}_{{\text{q}}}\), adjusted current seed quantity) using seed X-ray or cut-test data generated at the seed cleaning stage as a proportion of the current seed quantity (\({\text{C}}_{{\text{q}}}\))

$${\text{AC}}_{{\text{q}}} = \frac{{{\text{C}}_{{\text{q}}} \times {\text{X}}_{{\text{f}}} }}{{{\text{X}}_{{\text{n}}} }}$$

where \({\text{X}}_{{\text{f}}}\) is the number of full seeds observed and \({\text{X}}_{{\text{n}}}\) is the total number of seeds X-rayed or cut-tested; and (2) the cumulative percentage of unusable seeds (\({\text{CPU}}\)) observed over time, by using seed X-ray or cut-test data generated at the seed cleaning stage and seed cut-test data from ungerminated seeds derived from all routine germination tests carried out for monitoring the viability of collections.

$${\text{CPU}} = \left( {\frac{{{\text{X}}_{{\text{e}}} + {\text{X}}_{{\text{i}}} + \sum \left( {{\text{G}}_{{\text{e}}} + {\text{G}}_{{\text{i}}} } \right)}}{{{\text{X}}_{{\text{n}}} + \sum {\text{G}}_{{\text{n}}} }}} \right) \times 100$$

where \({\text{X}}_{{\text{e}}}\) is the number of empty and \({\text{X}}_{{\text{i}}}\) is the number of infested seeds observed, and \({\text{X}}_{{\text{n}}}\) is the total number of seeds that were X-rayed or cut-tested, and where \({\text{G}}_{{\text{e}}}\) is number of empty and \({\text{G}}_{{\text{i}}}\) is the number of infested ungerminated seeds observed and \({\text{G}}_{{\text{n}}}\) is the total number of seeds sown, all for a germination test. The average values of CPU were estimated for families with at least five different genera, genera with at least five different species and species with at least five different collections.
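The CPU formula pools the cleaning-stage observations with every germination test. A short sketch, assuming each germination test is supplied as a hypothetical `(empty, infested, sown)` tuple:

```python
def cumulative_percent_unusable(x_empty, x_infested, x_total, germ_tests):
    """CPU: pooled percentage of empty and infested seeds.

    x_empty, x_infested, x_total: counts from X-ray or cut-tests at cleaning.
    germ_tests: iterable of (empty, infested, sown) tuples, one per germination test.
    """
    tests = list(germ_tests)
    unusable = x_empty + x_infested + sum(e + i for e, i, _ in tests)
    examined = x_total + sum(sown for _, _, sown in tests)
    if examined == 0:
        raise ValueError("no seeds were examined")
    return 100 * unusable / examined
```

For instance, 3 empty and 2 infested seeds among 50 X-rayed seeds, followed by two germination tests of 25 seeds with (1, 0) and (2, 1) empty and infested ungerminated seeds, give 9 unusable seeds out of 100 examined, so CPU = 9%.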

Physiological status of seeds

The physiological status of seeds reflects their capacity to germinate, highlighting the true quality of collections. Seed germination tests are the most useful way of monitoring the physiological status of seeds over time in long-term storage, and they also enable the development of seed germination protocols for the propagation of taxa from seeds. At the MSB, the standard is that there should be 95% certainty that the lower bound of germinability of a collection is at least 75%, hence the expected overall germinability must be 85% or more. Anything less will lead to management decisions being taken to regenerate and/or recollect the taxon concerned. Seed germination protocols used at the MSB are standardised with conditions and/or treatments to break the more abundant types of seed dormancy. Details of germination test protocols are described in Appendix 1. We determined the physiological status of seeds from the results of their germination tests carried out after storage at − 20 °C at the MSB (post-storage tests). The initial round of germination tests was used to analyse the relative percentages of seed germination (\({\text{RG}}\)) and viability (\({\text{RV}}\)) and seed vigour (\({\text{R}}\), the index of germination rate or speed). Both initial and most recent retest rounds of germination tests were used to assess the longevity of collections in long-term storage, in terms of germinability.

Seed germination, viability and vigour

The true quality of mature seeds is usually reflected in the results of germination tests when seeds are exposed to optimum conditions (Godefroid et al. 2010). If the germination conditions are not optimal or incubation periods are extended, there is a risk that some viable seeds might die during the germination test (Hay and Probert 2013). Occasions may well arise when the results of germination tests could be misleading due to application of ineffective or partially effective dormancy-breaking treatments (Ellis et al. 1985). Also, when seeds are stored, they deteriorate with time losing their fitness due to aging prior to mortality (Walters et al. 2010). This may result in some seeds losing their ability to germinate or to produce healthy radicles which can grow into healthy or normal seedlings and plants. Therefore, it is critical to identify emergence of healthy radicles versus seeds developing abnormal seedlings (e.g. cotyledons produced without a radicle, indicative of accumulating genetic damage in ageing stored seeds) during seed germination tests (e.g. Roberts 1978). Abnormal seedlings are those that are not considered capable of continued growth and development due to damage, deformation or decay. Their numbers are excluded from RG calculation but included in RV calculation as they indicate a certain, minimal degree of seed viability.

A subjective measure of the viability of a collection at the end of a germination test is often calculated by including the cut-test results of ungerminated seeds (Crawford et al. 2007; Godefroid et al. 2010). As viable seeds may have died during the incubation period, ungerminated seeds that are firm and appear fresh are considered as an indication of potentially viable dormant seeds, but not an indication of overall viability at the start of the germination test (Crawford et al. 2007).

To minimise the incorrect estimation of low germination percentages, especially for taxa which are known to produce many embryoless seeds, the original number of seeds sown (\({\text{G}}_{{\text{s}}}\)) was adjusted to reflect the true number of potentially viable seeds sown (with embryos) by discounting any empty (\({\text{G}}_{{\text{e}}}\)) and infested (\({\text{G}}_{{\text{i}}}\)) ungerminated seeds (de Santana et al. 2018). For each collection, the following variables related to seed germination were calculated.

$${\text{RG}} = \left( {\frac{{{\text{G}}_{{\text{g}}} }}{{{\text{G}}_{{\text{s}}} - \left( {{\text{G}}_{{\text{e}}} + {\text{G}}_{{\text{i}}} } \right)}}} \right) \times 100$$
$${\text{RV }} = \left( {\frac{{{\text{G}}_{{\text{g}}} + {\text{G}}_{{\text{f}}} + {\text{G}}_{{\text{a}}} }}{{{\text{G}}_{{\text{s}}} - \left( {{\text{G}}_{{\text{e}}} + {\text{G}}_{{\text{i}}} } \right)}}} \right) \times 100$$
$${\text{R}} = \left( {\frac{{\sum \left( {{\text{g}} \times {\text{t}}} \right)}}{{\sum {\text{g}}}}} \right)$$

where \({\text{G}}_{{\text{g}}}\) and \({\text{G}}_{{\text{f}}}\) are respectively the total number of germinated seeds with healthy radicles and ungerminated seeds which appear fresh, \({\text{G}}_{{\text{a}}}\) is the total number of germinated seeds that produce abnormal seedlings, \({\text{t}}\) is the time from start of the germination period in days and \({\text{g}}\) is the number of newly germinated seeds at time t (Soltani et al. 2015).
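The three variables can be computed together for one germination test. The sketch below uses hypothetical names and assumes that the numerator of \({\text{R}}\) is the sum of the products g × t (a germination-weighted mean time in days), so that a smaller \({\text{R}}\) means faster germination.

```python
def germination_metrics(sown, empty, infested, germinated, fresh, abnormal, daily_counts):
    """Return (RG, RV, R) for a single germination test.

    daily_counts: list of (t, g) pairs, where g seeds newly germinated on day t.
    """
    true_sown = sown - (empty + infested)  # seeds with embryos
    if true_sown <= 0:
        raise ValueError("no potentially viable seeds were sown")
    rg = 100 * germinated / true_sown
    rv = 100 * (germinated + fresh + abnormal) / true_sown
    total = sum(g for _, g in daily_counts)
    # R is undefined when nothing germinated (assumption: report None).
    r = sum(t * g for t, g in daily_counts) / total if total else None
    return rg, rv, r
```

As a worked case, 50 seeds sown with 3 empty and 2 infested leave 45 true seeds; 36 healthy germinants, 4 fresh ungerminated seeds and 1 abnormal seedling give RG = 80% and RV of about 91%.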

The test with > 9 true seeds sown (with embryos) and yielding the highest \({\text{RG}}\) followed by the highest \({\text{RV}}\) was used to describe the initial physiological status of a collection (\({\text{RG}}\), \({\text{RV}}\) and \({\text{R}}\)). If more than one test yielded equally high \({\text{RG}}\) and RV for a collection, the test with highest number of true seeds sown or the test where seeds germinated within the shortest period of time, as indicated by \({\text{R}}\), was used.
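The selection rule can be expressed as a sort key. This is a sketch under stated assumptions: the record type `GermTest` and its field names are hypothetical, and because the text gives the two tie-breakers as alternatives, the order used here (more true seeds sown first, then smaller \({\text{R}}\)) is an assumption.

```python
from typing import NamedTuple, Optional, Sequence


class GermTest(NamedTuple):
    test_id: str
    true_sown: int  # seeds with embryos
    rg: float
    rv: float
    r: float        # germination-weighted mean time in days; smaller is faster


def best_test(tests: Sequence[GermTest]) -> Optional[GermTest]:
    """Pick the test that describes the initial physiological status."""
    eligible = [t for t in tests if t.true_sown > 9]
    if not eligible:
        return None
    # Highest RG, then highest RV, then most true seeds sown, then fastest.
    return max(eligible, key=lambda t: (t.rg, t.rv, t.true_sown, -t.r))
```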

Seed longevity

Based on the availability of data the following variables were estimated and analysed to understand the longevity of collections: (1) true age of collections from the year seeds were harvested to current; (2) number of years seeds were being stored in − 20 °C long-term storage at the MSB; (3) any significant loss in \({\text{RG}}\) since the collections were first placed in − 20 °C long-term storage at the MSB. The latter was calculated by comparing the highest \({\text{RG}}\) achieved for the initial round of germination tests with that of the most recent retest. The null hypothesis that both these values are the same was tested using the value of normal deviate Z from a two-tailed test according to Ellis et al. (1985), where a correction factor is applied to enable the normal distribution to be used in the analysis:

$${\text{Z}} = \frac{{{\text{p}}_{1} - {\text{p}}_{2} }}{{\sqrt {{\overline{\text{p}}}\left( {100 - {\overline{\text{p}}}} \right)\left( {(1/{\text{n}}_{1} ) + (1/{\text{n}}_{2} )} \right)} }}$$

where \({\text{p}}_{1}\) is \((100 \times {\text{g}}_{1} {/}({\text{n}}_{1 } - 0.5))\), \({\text{p}}_{2}\) is \((100 \times {\text{g}}_{2} {/}({\text{n}}_{2 } + 0.5))\), \({\overline{\text{p}}}\) is the mean value of \({\text{p}}_{1}\) and \({\text{p}}_{2}\), and \({\text{g}}_{1} { }\) and \({\text{g}}_{2}\) are number of germinated seeds, and n1 and n2 are number of true seeds sown (with embryos) in initial and most recent retest rounds of germination tests respectively; and (4) correlation between the number of years that seeds have been stored in − 20 °C long-term storage at the MSB and the change in RG during this period (the difference in the highest RG achieved for the most recent retest when compared to that of initial round of germination tests after storage). A sample Pearson correlation coefficient, r, was calculated to examine any linear correlation between these variables using GenStat software (VSN International Ltd).
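A minimal sketch of the normal deviate follows, using the corrected percentages exactly as defined in the text and assuming that the square root in the denominator spans the whole product, as in the usual two-proportion test. The function name is hypothetical, and the guard for a non-positive variance is an added safeguard for edge cases such as both tests germinating fully.

```python
import math


def normal_deviate(g1: int, n1: int, g2: int, n2: int) -> float:
    """Z for the change in germination between the initial test (1) and retest (2)."""
    p1 = 100 * g1 / (n1 - 0.5)  # corrected percentage, initial test
    p2 = 100 * g2 / (n2 + 0.5)  # corrected percentage, most recent retest
    p_bar = (p1 + p2) / 2
    variance = p_bar * (100 - p_bar) * (1 / n1 + 1 / n2)
    if variance <= 0:
        raise ValueError("Z is undefined when the pooled percentage is 0 or at least 100")
    return (p1 - p2) / math.sqrt(variance)
```

With 45 of 50 seeds germinating initially and 30 of 50 at retest, the function returns a Z of about 3.65, which exceeds the two-tailed 5% critical value of 1.96.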

Seed germination protocols

Suitable germination protocols for a taxon were chosen from post-storage initial germination tests, across collections if represented by more than one collection, where the number of true seeds sown (with embryos) was > 9 and where \({\text{RG}}\) was at least 70%. The aim was to identify the best germination test where: seeds were exposed to a single constant or alternate incubation temperature; no dormancy breaking conditions and/or treatments were used; the highest \({\text{RG}}\) followed by the highest \({\text{RV}}\) were achieved; and seeds germinated within the shortest period of time as indicated by R.

Results

The MSB is a duplicate storage facility for conserving a portion of the original harvest for 61% of the 523 collections reviewed. The other portion was conserved in the country of origin and/or elsewhere. The proportionate division of the share to MSB is unknown but is generally 50%. Where a portion is not stored in the country of origin, it is usually because either suitable facilities were not available in-country, or the collection was regarded as too low in seed number to be split without potentially compromising the population genetic diversity represented in each sub-sample.

Geographic and bioclimatic representativeness

The geographic origin of the collections covered all nine bio-geographic continents and represented 67 countries. Some taxa were collected from more than one continent and/or country (Fig. 1). Fifteen collections (~ 3%) representing 15 taxa had unknown geographic origins. Most taxa and collections originate from Africa and the least from the Pacific, Asia-Tropical and Antarctic (Fig. 2). Countries where > 10 globally threatened taxa were represented (number of taxa followed by collections in brackets): Madagascar (27, 35); Chile (17, 26); Saint Helena, Ascension and Tristan da Cunha (16, 30); Italy (15, 21); South Africa (14, 18); Australia (13, 39); Tanzania (13, 27); Kenya (13, 14); USA (12, 27); Mauritius (12, 15); Georgia (11, 12); and Namibia (11, 11).

Fig. 1 The geographic origin of globally threatened plants conserved at the MSB, RBG Kew. The representative sample included 523 collections from 303 seed plant taxa. Cultivated collections inherited the geographic origin of the wild plant population from which they were propagated or regenerated. Total number of taxa are shown according to different size classes. Some taxa originated from more than one continent and/or country. Countries that are not shown on the map are (number of taxa in brackets): Bermuda (5); British Virgin Islands (1); Cayman Islands (3); Falkland Islands (5); Mauritius (12); Saint Helena, Ascension and Tristan da Cunha (16); and Turks and Caicos Islands (6)

Fig. 2 Percentage of seed collections and taxa originating from each bio-geographic continent for globally threatened plants conserved at the MSB, RBG Kew. Some taxa originated from more than one continent

The Köppen–Geiger climate of the habitats where the wild-collected seeds were harvested was mapped for 350 collections representing 207 taxa (Table 1; Fig. 3). Seeds were collected from all five Köppen–Geiger climate groups, representing 21 of 33 climate zones. The majority of collections and taxa (in brackets, respectively) originated from temperate climates (184, 90), followed by tropical (84, 64), arid (50, 40), cold continental (26, 16) and polar (6, 3) climates. Regarding the seasonal precipitation level of these habitats, the majority of collections and taxa (in brackets, respectively) originated from habitats with no dry season (104, 38), followed by those with a dry summer (88, 67), wet savanna (50, 34), dry winter (45, 35), unknown precipitation level (23, 13), monsoon (15, 15), dry savanna (15, 15), tundra (6, 3) and rainforest (4, 3). In terms of the level of heat, the majority of collections and taxa (in brackets, respectively) were collected from habitats experiencing a warm summer (133, 59), followed by unknown heat level (90, 70), hot summer (69, 49), cold (33, 26), hot (17, 15) and cold summer (8, 4). Some taxa were collected from more than one climate zone.

Table 1 Köppen–Geiger climate (Köppen 1936) of wild habitats where seeds originated from for collections of globally threatened plants conserved at the MSB, RBG Kew

Fig. 3 Köppen–Geiger climate of wild habitats from where seeds were harvested for globally threatened plants conserved at the MSB, RBG Kew. Sample included 350 collections representing 207 taxa. Refer to Table 1 for climate abbreviations

Some bias in geographic or bioclimatic representation is unsurprising and was probably to be expected. Early, pre-MSB seed collecting expeditions for the Kew seed bank focused on the UK and Europe, especially the Mediterranean region; and for much of the MSB Project (2000–2010) the focus was on dryland areas in a number of countries and regions.

Taxonomic and genetic diversity

The 303 taxa comprised 39 gymnosperms and 264 angiosperms from 74 families, 199 genera and 297 species. The IUCN (2017) global conservation statuses for these taxa were (number of taxa in brackets): EW (4); CR (56); EN (105); and VU (138). The distribution of the number of globally threatened species conserved among seed plant families is illustrated in Fig. 4. Although there are gaps in the distribution, species occurred throughout the phylogenetic tree. Families with at least 10 taxa represented were (number of taxa in brackets): Fabaceae or Leguminosae (37); Pinaceae (21); Asteraceae or Compositae (18); Cupressaceae (16); Cactaceae (14); and Rubiaceae (10).

Fig. 4 Phylogenetic distribution of globally threatened plants conserved at the MSB, RBG Kew. The phylogenetic tree is a version of Smith and Brown (2018) reduced to include only one tip for each family of angiosperms and gymnosperms. Species were assigned to families following WCVP (2020). Bar charts in the middle-grey area indicate the number of species represented for each family. Scale is given with a middle-dotted line which indicates representation of 20 species

About 79% of collections originated from natural or semi-natural habitats (wild origin) and 21% from either cultivated habitat (e.g. orchards, home gardens and botanic gardens) or propagation/regeneration activities in the UK or elsewhere (cultivated origin). The total number of individual plants sampled to make the original collection varied from one to > 1000 plants: 50 to > 1000 (26%); 10–49 (33%); and < 10 (41%). The total number of collections conserved for a taxon from different populations varied from one to 31: one collection (52%); 2–4 collections (35%); 5–10 collections (10%); and 11–31 collections (3%). The total number of potentially viable or usable seeds currently conserved for a taxon varied from 12 to 1,972,356 seeds: < 501 (10%); 501–1000 (16%); 1001–1500 (9%); 1501–2000 and 2001–2500 (8% each); 2501–5000 (13%); 5001–10,000 (11%); and > 10,000 (25%).

Quality of seeds

Due to the availability of seed X-ray and cut-test data, adjusted seed quantities (the quantity of potentially viable seeds or usable seeds) for original and current collections were estimated for only 510 of the collections.

Initial quality

The estimated original seed quantity (\({\text{O}}_{{\text{q}}}\)) ranged from 39 to 2,012,812 seeds and the adjusted original seed quantity (\({\text{AO}}_{{\text{q}}}\)) ranged from 31 to 1,972,556 seeds. About 39% of the original collections consisted of all potentially viable or usable seeds but for the rest, the quantity was reduced by 2% to 90% due to empty and infested seeds (Figs. 5, 6). As a result, original collections with > 2500 seeds were reduced by 17% (by 36 from 216 collections), those with 2001–2500 seeds increased by 16% (by 5 from 31 collections), collections with 1501–2000 seeds were reduced by 19% (by 10 from 53 collections), and those with ≤ 1500 seeds were increased by 20% (by 41 from 210 collections). Therefore, about 21% of the original collections contain > 5000 and 18% contain < 501, potentially viable or usable seeds. The pairwise comparison of collections for their total quantities of \({\text{O}}_{{\text{q}}}\) and \({\text{AO}}_{{\text{q}}}\) confirmed a significant decline in the number of potentially viable or usable seeds due to empty and infested seeds in the original harvest (results of one-tailed paired t-test: t = 4.20 on 509 d.f. and probability < 0.001 at 95% confidence level).

Fig. 5 Percentage reduction of potentially viable or usable seed quantity due to empty and infested seeds in the original harvest of globally threatened plants conserved at the MSB, RBG Kew

Fig. 6 Estimated seed quantities for collections of globally threatened plants conserved at the MSB, RBG Kew: percentages of collections falling under different size classes of original (Oq), adjusted original (AOq) and adjusted current (ACq) seed quantities

Current quality

At the time of this review the estimated adjusted current seed quantity (\({\text{AC}}_{{\text{q}}}\)) ranged from 31 to 1,972,356 seeds. About 21% of the current collections contain > 5000, and 19% contain < 501, potentially viable or usable seeds (Fig. 6). The estimated cumulative percentage of unusable seeds observed over time (\({\text{CPU}}\), sample size = 508 collections) ranged from zero to 85%. About 24% of collections consisted entirely of potentially viable or usable seeds (\({\text{CPU}}\) = 0%); for 36% of collections the \({\text{CPU}}\) was between 1 and 10%, for 29% it was between 11 and 50%, and for 11% it was > 50% (Fig. 7).
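As an illustration of how collections can be grouped into the CPU size classes quoted above, the following Python sketch assigns each CPU value to a class and reports the share of collections per class. The CPU values are hypothetical, and the treatment of fractional values (anything above zero and up to 10 falls in the 1–10% class) is an assumption made for the example.

```python
from collections import Counter


def cpu_class(cpu_percent):
    """Assign a CPU value (0-100) to one of the four size classes in the text."""
    if cpu_percent == 0:
        return "0%"
    if cpu_percent <= 10:
        return "1-10%"
    if cpu_percent <= 50:
        return "11-50%"
    return "> 50%"


def class_shares(cpu_values):
    """Percentage of collections falling in each CPU class."""
    counts = Counter(cpu_class(v) for v in cpu_values)
    total = len(cpu_values)
    return {label: 100 * n / total for label, n in counts.items()}


cpu_values = [0, 0, 4, 9, 10, 23, 50, 85]  # hypothetical CPU values, one per collection
shares = class_shares(cpu_values)
print(shares)
```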

Fig. 7 Cumulative percentage of unusable seeds (CPU) observed in collections of globally threatened plants conserved at the MSB, RBG Kew: percentages of collections falling under different size classes of CPU

Average CPU values were estimated for 11 families, eight genera and 15 species (Table 2). Collections from four families (Cupressaceae, Pinaceae, Apiaceae and Asteraceae or Compositae), three genera (Abies, Pinus and Dalbergia) and four species (Abies fraseri, Callitris oblonga, Pinus tecunumanii and Widdringtonia whytei) appeared to consist of 20% or more unusable seeds.

Table 2 Cumulative percentage of unusable seeds (\({\text{CPU}}\)) observed over time for collections of globally threatened plants conserved at the MSB, RBG Kew

Physiological status of seeds

To estimate the physiological status of seeds, post-storage germination data were analysed for a total of 1099 initial tests across 523 collections and 140 most recent retests across 78 collections. For initial tests the true number of seeds sown (with embryos) ranged from 10 to 298 for 84% of tests and was < 10 for 16% of tests; for retests it ranged from 10 to 75 for 96% of tests and was < 10 for 4% of tests.

Seed germination, viability and vigour

The relative germination (\({\text{RG}}\)) and viability (\({\text{RV}}\)) for post-storage initial germination tests ranged from zero to 100% with an average of 70% and 79% respectively for collections and 67% and 77% respectively for taxa (Fig. 8). About 57% and 66% of collections and 56% and 66% of taxa respectively showed high \({\text{RG}}\) and \({\text{RV}}\) (> 80%). By comparison, 28% and 20% of collections and 32% and 22% of taxa respectively showed low \({\text{RG}}\) and \({\text{RV}}\) (≤ 50%). About 9% of collections (~ 12% of taxa) showed 0% \({\text{RG}}\) and about 4% of collections (5% of taxa) showed both 0% \({\text{RG}}\) and \({\text{RV}}\).
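For readers who wish to reproduce such percentages from their own test records, a minimal Python sketch follows. It assumes that \({\text{RG}}\) is the number of germinated seeds as a percentage of the true seeds sown (seeds with embryos), and that \({\text{RV}}\) additionally counts non-germinated seeds judged fresh in a cut test. These formulas and the function names are assumptions made for illustration, not the definitions used in the review.

```python
def true_seeds_sown(sown, empty):
    """Seeds sown minus those found to be empty (no embryo)."""
    true_sown = sown - empty
    if true_sown <= 0:
        raise ValueError("no seeds with embryos were sown")
    return true_sown


def relative_germination(germinated, sown, empty):
    """Assumed RG: germinated seeds as a percentage of true seeds sown."""
    return 100 * germinated / true_seeds_sown(sown, empty)


def relative_viability(germinated, fresh_ungerminated, sown, empty):
    """Assumed RV: germinated plus fresh non-germinated seeds, as a percentage of true seeds sown."""
    return 100 * (germinated + fresh_ungerminated) / true_seeds_sown(sown, empty)


# Hypothetical test: 50 seeds sown, 10 empty, 30 germinated, 6 fresh at cut test
rg = relative_germination(30, 50, 10)
rv = relative_viability(30, 6, 50, 10)
print(rg, rv)
```

Under these assumptions \({\text{RV}}\) can never be lower than \({\text{RG}}\) for the same test, which is consistent with the averages reported above.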

Fig. 8 Relative germination (RG) and viability (RV) percentages at post-storage initial germination tests for seed collections of globally threatened plants conserved at the MSB, RBG Kew: percentage of collections falling under different size classes of RG and RV

The index of germination rate or speed \(\left( {\text{R}} \right)\) was calculated for 497 collections, representing 295 taxa. \({\text{R}}\) ranged from zero to 405 days with an average of 25 days (Fig. 9). About 62% of collections germinated within three weeks of sowing on media and exposure to an incubation temperature, while ~ 11% of collections continued beyond the expected six-week period for germination (range from 45 to 405 days).

Fig. 9 Index of germination rate or speed (R) at post-storage initial germination test for seed collections of globally threatened plants conserved at the MSB, RBG Kew: percentage of collections falling under different size classes of R

Seed longevity

The date of seed harvest was available for 496 collections. The true age of collections ranged from three to 46 years (Fig. 10a): the majority were aged either 6–10 years (35%) or 11–15 years (~ 35%), followed by 1–5 years (12%). Collections have been stored in − 20 °C long-term storage at the MSB for 1–45 years (Fig. 10b): 1–15 years (91%); 16–30 years (~ 8%); and > 30 years (~ 1%).

Fig. 10 True age and longevity of seed collections of globally threatened plants conserved at the MSB, RBG Kew: (a) true age of collections, from the year seeds were harvested to the present; and (b) number of years seeds have been held in − 20 °C long-term storage

Only 78 collections (nine gymnosperms and 69 angiosperms) were retested over time (1st retest, 2nd retest, 3rd retest, etc.): 51 collections up to 1st; 19 up to 2nd; six up to 3rd; and two up to 4th. These collections represent 31 families, 47 genera and 52 taxa. For these collections the period between the initial test and the most recent retest ranged from three to 36 years: 1–5 years (6%); 6–10 years (41%); 11–15 years (21%); 16–20 years (18%); and > 20 years (14%).

On average, retested collections achieved 81% \({\text{RG}}\) at initial test, which decreased to 74% at most recent retest. About 80% of collections achieved \({\text{RG}}\) > 70% for their initial test but the percentage of collections decreased to ~ 65% in retests and conversely, 16% of collections achieved \({\text{RG}}\) ≤ 50% for their initial tests but the percentage of collections increased up to ~ 24% in retests (Fig. 11).

Fig. 11 Relative germination percentages (RG) of seed collections of globally threatened plants analysed for longevity during storage at the MSB, RBG Kew: percentage of collections falling under different size classes of RG during initial and most recent retest are shown separately

The comparison of \({\text{RG}}\) between initial and the most recent retest revealed that it remained the same for 20 collections (\({\text{RG}}\) was 100% for 14 collections, 98% for two collections, 14% for one collection and 0% for three collections), increased between 2 and 56% for 22 collections (significant for four collections with Z > 1.96 and P ≤ 0.05) and decreased between 1 and 88% for 36 collections (significant for 13 collections with Z > 1.96 and P ≤ 0.05). Therefore, decline in germinability during variable time of storage was evident for 16% of the 78 collections analysed for longevity. There was no apparent correlation between the number of years that seeds were being stored and the change in RG during this period (Fig. 12: Sample Pearson correlation coefficient (r) = 0.21).
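The text reports significance as Z > 1.96 (P ≤ 0.05) but does not state which Z test was applied. One common choice for comparing two germination proportions is a pooled two-proportion z-test; the Python sketch below assumes that test purely for illustration, with hypothetical seed counts.

```python
import math


def two_proportion_z(germ_1, n_1, germ_2, n_2):
    """Pooled two-proportion z statistic for germination in two tests.

    germ_i = seeds germinated, n_i = true seeds sown in test i.
    Assumed here for illustration; the review does not name its Z test.
    """
    p1, p2 = germ_1 / n_1, germ_2 / n_2
    pooled = (germ_1 + germ_2) / (n_1 + n_2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    if se == 0:  # both tests at 0% or both at 100%: no detectable difference
        return 0.0
    return (p1 - p2) / se


# Hypothetical collection: 45 of 50 seeds germinated initially, 30 of 50 at retest
z = two_proportion_z(45, 50, 30, 50)
print(round(z, 2), abs(z) > 1.96)
```

A positive z here indicates a decline from the initial test to the retest; |z| > 1.96 corresponds to the two-sided 5% level.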

Fig. 12 Correlation between the number of years that seeds have been stored in − 20 °C long-term storage at the MSB and the change in the relative germination percentage (RG) during this period (difference in the highest RG achieved for the most recent retest when compared to that of the initial round of germination tests after storage). To handle both negative (decrease) and positive (increase) values for change in RG, a constant value of 100 is added to all data prior to log transformation. Sample Pearson correlation coefficient on semi-logarithmic transformed data (r) = 0.21
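The transformation described in the caption of Fig. 12 can be sketched in Python as follows. The base of the logarithm is not stated in the caption, so base 10 is assumed here, and the storage periods and RG changes are hypothetical values chosen only to show the mechanics. The shift of 100 keeps every value positive as long as no change equals −100.

```python
import math


def shifted_log(changes, shift=100):
    """Add a constant to each change in RG, then take log10 (base assumed)."""
    return [math.log10(c + shift) for c in changes]


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


years_stored = [3, 8, 12, 20, 36]   # hypothetical storage periods (years)
rg_change = [0, -10, 5, -50, -88]   # hypothetical change in RG (percentage points)
r = pearson_r(years_stored, shifted_log(rg_change))
print(round(r, 2))
```

Only the y axis is transformed, which is what makes the relationship semi-logarithmic.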

Seed germination protocols

The best germination protocols, with at least 70% \({\text{RG}}\), identified from post-storage initial germination tests for 165 taxa from 50 families are given in Appendix 2. The germination protocols of a further 19 taxa with at least 70% \({\text{RG}}\) were excluded due to the low number of true seeds sown (with embryos). The incubation temperature used for initial tests was either constant (77%), alternating (19%) or a combination of constant and alternating (4%). Overall, 43% of post-storage initial tests were set up with a dormancy breaking condition and/or treatment (see Appendix 1). Single temperatures suitable for the germination of non-dormant seeds were applied in 89% of tests (medium temperatures 57%; high temperatures 31%; and low temperatures 1%). Temperatures suitable for breaking seed dormancy were applied to 11% of tests (cold stratification 8%; move-along through three or more different temperatures 2%; combined stratification 0.7%; and warm stratification 0.3%). About 32% of tests were set up with one or more dormancy breaking treatments (scarification 10%; surgical 8%; use of gibberellic acid in germination medium 7%; mechanical manipulation 6%; and application of after-ripening, use of nitric acid in germination medium or soaking seeds in smoke solution < 1% each). Non-dormancy breaking methods such as use of a 24 h dark photoperiod and/or anaerobic conditions were limited to < 1% of tests.
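The selection rule described above (at least 70% \({\text{RG}}\), and tests with too few true seeds sown excluded) can be expressed as a short Python sketch. The record structure, the taxon and protocol names and the cut-off of ten true seeds are illustrative assumptions; RG is taken here as germinated seeds as a percentage of true seeds sown.

```python
from dataclasses import dataclass


@dataclass
class GerminationTest:
    taxon: str
    protocol: str
    true_seeds_sown: int  # seeds with embryos
    germinated: int

    @property
    def rg(self):
        """Assumed RG: germinated seeds as a percentage of true seeds sown."""
        return 100 * self.germinated / self.true_seeds_sown


def best_protocols(tests, min_rg=70, min_sown=10):
    """Keep, per taxon, the qualifying test with the highest RG."""
    best = {}
    for test in tests:
        # The sample-size check runs first, so rg is never computed for zero seeds.
        if test.true_seeds_sown < min_sown or test.rg < min_rg:
            continue
        if test.taxon not in best or test.rg > best[test.taxon].rg:
            best[test.taxon] = test
    return best


# Hypothetical test records
tests = [
    GerminationTest("Taxon A", "20 C constant", 20, 15),
    GerminationTest("Taxon A", "25/10 C alternating", 20, 19),
    GerminationTest("Taxon B", "cold stratification, then 20 C", 8, 8),
    GerminationTest("Taxon C", "20 C constant", 25, 10),
]
best = best_protocols(tests)
print({taxon: t.protocol for taxon, t in best.items()})
```

In this example Taxon B is excluded despite 100% germination because only eight true seeds were sown, mirroring the exclusion of the 19 taxa mentioned above.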

Discussion

Pre-MSB seed collecting expeditions for the Kew seed bank focused on the UK and Europe, especially the Mediterranean region. The MSB Project (2000–2010) conservation effort was initially focused on drylands as species adapted to hot, dry environments may have evolved longer lifespans in the dry state and many produce orthodox seeds and so are suited to conservation in seed banks (Li and Pritchard 2009). The world’s drylands are also home to an immense variety of plant life, support approximately one-fifth of the world’s population (far more than the tropical rain forests), as well as 50% of the world’s livestock, and provide forage for both domestic animals and wildlife (van Slageren 2003). Drylands are identified as among the most threatened environments on Earth, with large areas being lost due to desertification each year (van Slageren 2003). The breadth and depth of species coverage and their genetic diversity conserved at the MSB are likely influenced by the nature of different funding models and objectives of different conservation projects.

The collections reviewed originated from a wide geographic range. Given the overall collecting bias, many of the species collected would be expected to come from habitats with a more or less arid climate and limited seasonal precipitation. In fact, the majority of threatened species have been collected from temperate climates, from habitats with no dry season but with warm summer periods. As expected, our sample included only a few species that originated from tropical habitats, as the majority of species from these habitats bear seeds that could be extremely short-lived or recalcitrant, with cryopreservation possibly the only means to ensure their effective ex situ seed conservation (Li and Pritchard 2009; Hay and Probert 2013).

The taxonomic composition within the sample highlighted a substantial diversity. As families with a high incidence of recalcitrant species are less likely to be conserved in conventional seed banks, we did not expect many collections and/or taxa from families such as (in brackets, total number of globally threatened taxa listed in IUCN (2017) followed by the number of globally threatened collections and taxa conserved at the MSB): Fagaceae (63, 0, 0); Lauraceae (223, 0, 0); Sapotaceae (252, 0, 0); Moraceae (59, 0, 0); Clusiaceae or Guttiferae (126, 0, 0); Sapindaceae (119, 1, 1) including Aceraceae (0, 0, 0); Arecaceae or Palmae (339, 7, 5); Myrtaceae (299, 0, 0); Annonaceae (193, 0, 0); Rutaceae (137, 3, 3); Anacardiaceae (94, 11, 1); Dipterocarpaceae (407, 0, 0); Meliaceae (155, 8, 7) and Rhizophoraceae (13, 0, 0) (Dickie and Pritchard 2002).

Almost four fifths of the collections (79%) were made directly from the wild, with the assumption that they represented a greater genetic resource than those from a cultivated gene pool. Nevertheless, by comparison with overall MSB holdings, which consist of almost 92% of collections with a wild genetic heritage (Liu et al. 2018), the threatened species collections appear to have relied rather more on non-wild sources. Furthermore, the majority of collections and taxa are likely to suffer from low genetic diversity, as a low number of individual plants (< 50), different populations (< 5) and/or potentially viable or usable seeds (< 5000) were sampled at the original harvest. A large proportion of empty and infested seeds in the original harvest significantly affected the quality of collections in terms of availability of viable or usable seeds. As a result, just over one third of taxa and one fifth of collections consisted of ≥ 5000 potentially viable or usable seeds and the majority were below the recommended threshold. It should be noted that for 61% of collections reviewed, only a portion of the original harvest is conserved at the MSB. Although conserving the original harvest at multiple locations increases security for the germplasm, the total conservation effort could be undermined by low seed numbers stored at individual locations.
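The three indicators of potentially low genetic diversity mentioned above (fewer than 50 plants, fewer than five populations, fewer than 5000 usable seeds) could be screened for in a collection database along the following lines. This is a minimal Python sketch; the function name, flag labels and example values are hypothetical.

```python
def sampling_flags(plants_sampled, populations_sampled, usable_seeds):
    """List the thresholds from the text that a taxon or collection falls below."""
    flags = []
    if plants_sampled < 50:
        flags.append("few plants")
    if populations_sampled < 5:
        flags.append("few populations")
    if usable_seeds < 5000:
        flags.append("few seeds")
    return flags


# Hypothetical records: (plants sampled, populations sampled, usable seeds)
print(sampling_flags(8, 1, 420))
print(sampling_flags(60, 5, 5000))
```

An empty list means that none of the three thresholds is missed; it does not by itself demonstrate adequate genetic coverage.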

Difficulties in meeting the demands for genetic diversity and seed quantity are reflected in seed collections of threatened plants conserved at the MSB and worldwide. For example, about 34% of overall MSB holdings are represented by > 5000 potentially viable or usable seeds (Liu et al. 2018), but this estimate reduces to 21% for threatened collections. Of the European threatened flora conserved among the seed banks of European Native Seed Conservation Network, only one third of taxa are represented by at least five collections, and only 23–28% of the species are represented by collections with ≥ 5000 seeds (Godefroid et al. 2011; Rivière and Müller 2017; Rivière et al. 2018). About 71% of the collections conserved for threatened species in the Australian PlantBank had < 1000 seeds (Offord et al. 2004), while 50% of the collections conserved for critically endangered taxa in the Western Australian Threatened Flora Seed Centre (WA TFSC) consisted of < 1000 seeds (Cochrane et al. 2007). Threatened taxa are often rare with fewer extant populations or have low seed production. It does, however, mean that more needs to be done to ensure the genetic coverage of ex situ conservation collections of these taxa.

Although incorporating spatial distribution patterns and population structures into sampling strategies is recommended, at least for poorly connected or sparsely distributed plant taxa, common protocols for ex situ collections of seeds have not usually considered these characteristics (Hoban and Schlarbaum 2014). Due to a lack of data on spatial distribution patterns of populations and population structure, we are unable to determine whether the assumed reduced genetic diversity is related to sampling strategy or to species characteristics such as breeding system. Therefore, we suggest integrating such characteristics with future seed collection activities. For example, RBG, Kew launched the UK National Tree Seed Project (UKNTSP) in 2013 as an ex situ seed conservation initiative adopting a well-designed sampling strategy based on various social and ecological factors including the geographic patterns of targeted species in biogeographic zones (Kallow and Trivedi 2016). This is specifically to capture genetic variation within and among populations in order to protect against the loss of genetic variation from threats, including pests and diseases and climate change. The germplasm of British populations of European yew (Taxus baccata) conserved by this project at the MSB was found to be representative of wild populations in terms of allelic capture, including rare and locally common variants, indicating that the sampling protocol applied is appropriate (Gargiulo et al. 2019).

Despite the importance of and need for implementing the correct sampling strategy, limitations to the ability to capture wild genetic diversity remain. This is primarily because the seed ecology of most wild plant species, including their distribution patterns, breeding biology and population genetics, remains unknown (Mortlock 2000; Merritt and Dixon 2011; Hay and Probert 2013; Teixido et al. 2017). Furthermore, the availability of funding limits the capacity for comprehensive conservation programmes to sample and conserve all suitable populations. Also, many of the perceived shortcomings of seed banking of wild plants may arise from the heterogeneity of wild plant populations. Wild plant populations tend to be heterogeneous in distribution patterns, genetic integrity, flowering and fruiting seasons (phenology), production of viable seeds and seed maturity. This affects the availability of mature and viable seeds for ex situ conservation and consequently seed germinability, vigour and longevity in seed banks. Seed collection protocols have been developed to ensure the survival of the natural populations is not threatened, e.g. by collecting no more than 20% of the mature seeds available on the day of collection (Way 2003), or 5% of the reproductive material on each plant for threatened species (Offord et al. 2004). As the taxa that are identified as globally threatened are often characterised by small and fragmented populations, collection sizes are necessarily small. To increase the overall seed resource, multi-year sampling is often used, but individual collections tend to consist of small seed quantities (Cochrane et al. 2007). Regeneration of seeds in purpose-built facilities is an alternative method to increase the amount of seed of a species required for in situ conservation needs; this is, however, beyond the scope of most wild species seed banks.

Some taxa appear naturally to produce critically high proportions of non-viable seeds, though in most cases the relative contribution of genetic and environmental factors to this is unknown. Thus, they have a very low potential for regeneration from seed, necessitating specific in situ and ex situ conservation strategies to avoid biodiversity loss (Godefroid et al. 2011). For example, Dayrell et al. (2016) found in the megadiverse heterogeneous grasslands of the campo rupestre that, in 46% of the 83 populations studied (representing different species), at least half of the seeds produced were embryo-less and/or non-viable, suggesting phylogeny is related to seed viability percentages. Utah juniper (Juniperus osteosperma) is one of many plant species that produce large numbers of fruits containing parthenocarpic or otherwise empty or unviable seeds (Fuentes and Schupp 1998). Poor germination of seeds is a common occurrence in the Umbelliferae (Apiaceae), the most probable cause being the presence of non-viable seeds with no embryo as a result of the Lygus bug feeding on developing seeds (Robinson 1954). Plant families such as Asteraceae (Compositae), Cyperaceae and Poaceae are known to produce many empty seeds while Fabaceae (Leguminosae) seeds often suffer insect damage (Way and Gold 2008). As a result, achieving ideal seed quantities is often impossible when conserving wild plants of threatened taxa. About 21% of the collections reviewed belonged to the plant families mentioned above. We identified four families (including Asteraceae or Compositae and Apiaceae stated above), three genera and four species which are likely to produce 20% or more unusable seeds. Legumes seem to have a low quantity (7%) of unusable seeds in their collections; however, estimates were made after seed cleaning, and as legumes have relatively large seeds, empty and infested seeds are more likely to be identified and removed during the cleaning process. Our sample included collections from the families Cyperaceae and Poaceae, but the number of genera represented from each family was insufficient to draw any conclusion.

The propagation of plants from seeds is a viable, inexpensive and generally effective method in conservation activities (Cerabolini et al. 2004; Cirak 2007). Viable seeds in the collections reviewed exhibited a sound physiological status in terms of germinability, viability and vigour at the initial round of germination tests after long-term storage. We identified suitable germination protocols for 165 taxa from tests where the number of true seeds sown (with embryos) was > 9 and relative germination was at least 70%.

The percentages of collections achieving > 80% relative germination (57%) and viability (66%) were respectively 4% and 10% less than the estimates reported for all MSB holdings in Liu et al. (2018). On average, relative germination percentages achieved by collections (RG = 70%) and taxa (RG = 67%) were 15–18% below the expected 85% germinability and 8–11% above the average estimated for threatened species of the Belgian flora (RG = 59%) conserved in long-term storage at the National Botanic Gardens of Belgium, where 59% of collections require regeneration or recollection (Godefroid et al. 2010). The low germination percentage reported for the Belgian flora is related to mouldiness of seeds during the germination process and the authors reported that reducing fungal proliferation by surface-sterilisation of the seeds could improve results.

About 4% of the reviewed collections (5% of taxa) showed both 0% \({\text{RG}}\) and \({\text{RV}}\). As \({\text{RV}}\) was an estimate calculated using cut-test results of non-germinated seeds, collections with both 0% \({\text{RG}}\) and \({\text{RV}}\) will be further assessed using additional germination test conditions and/or treatments or tri-phenyl tetrazolium chloride stain (see Appendix 1).

Germinability alone does not reflect the viability of a collection. The 85% germinability threshold, an estimate derived from agricultural settings, is a very high threshold for wild species. Even in their natural habitats some taxa may not achieve a germination percentage as high as 85%. Viable seeds may also fail to germinate because of quiescence and/or dormancy. Quiescence is a state of suspended growth of the embryo of non-dormant seeds when minimum requirements for germination are lacking, e.g. water, temperature, gases and light. Dormancy, a state in which seeds are prevented from germinating even under environmental conditions normally favorable for germination, is determined by morphological and physiological properties of the seed. Both quiescence and dormancy must be relieved for seeds to germinate and establish seedlings. Dormancy may be a main determinant of a species’ distribution, ensuring germination occurs under appropriate seasonal conditions, thereby reducing extinction risk and providing the opportunity for subsequent adaptive divergence (Willis et al. 2014). However, an inability to break seed dormancy is an obstacle preventing dormant but healthy seeds from germinating (Merritt and Dixon 2011). About 28% of seed collections belonging to threatened Belgian flora appeared to exhibit some degree of dormancy, with the majority being non-dormant (Godefroid et al. 2010). The authors highlighted that while dry seed storage may induce secondary dormancy in some species, it may also break dormancy through the after-ripening process in others (especially those with non-deep physiological dormancy). Examining seed dormancy is beyond the scope of our review but at the MSB, seed germination protocols are standardised to overcome the most common seed dormancies (Appendix 1).

Decline in germinability was evident for at least 16% of collections during variable times of storage. A previous analysis of germination data for collections stored at the MSB for over 20 years showed no significant reduction in viability during this period in 86% of 2388 collections, and concluded that the seed drying (15% Relative Humidity and 15 °C) and storage (− 20 °C) conditions used at the MSB are suitable for long-term storage of orthodox seeds (Probert 2003). Most of the seed collections held by the WA TFSC in short (< 5 years) and medium (5–12 years) term storage maintained their viability, and declines in germination were only evident for a small number of collections, representing 10 taxa, stored in medium term conditions (Crawford et al. 2007). Many of the above declines were collection-specific and not reproduced by other collections of the same taxon. In the Australian PlantBank, seed viability assessments for the collections belonging to threatened species indicated that nearly 56% of collections older than two years had a viability > 80%, indicating that the remaining 44% of the accessions required recollecting (Offord et al. 2004).

An analysis of longevity for about 42,000 orthodox seed accessions (Walters et al. 2005) highlighted that some plant families had characteristically short-lived (e.g. Apiaceae and Brassicaceae) or long-lived (e.g. Malvaceae and Chenopodiaceae) seeds, and seeds from species originating from particular localities had characteristically short (e.g. Europe) or long (e.g. South Asia and Australia) shelf lives. The effect of seed traits and environmental conditions at the site of collection on seed longevity was explored for 195 species stored at the MSB by ageing seeds at elevated temperatures and relative humidity (Probert et al. 2009). Although the causes of seed death in seed banks and under rapid-ageing conditions may not be the same, Probert et al. (2009) suggested that the apparent short-lived nature of endospermic seeds from cool wet environments may have implications for re-collection and re-testing strategies in ex situ conservation. There is evidence that some species produce orthodox seeds of short longevity in dry storage (Walters et al. 2005). Understanding species’ differences in longevity is crucial for the effective management of collections in seed banks because it underpins the selection of viability monitoring periods and hence regeneration or re-collection strategies (Probert et al. 2009).

Wild-species seed banks have been criticized for having seed holdings that are insufficient to provide for the needs of large-scale restoration projects and for lacking documentation or knowledge of seed quality measures (e.g. germinability, viability and vigour), seed dormancy breaking procedures and germination protocols. The purpose of wild-species seed banks is to preserve enough genetic diversity to prevent species and populations from extinction. Restoration guidelines strongly recommend using local seed sources to maximize local adaptation and prevent outbreeding depression, but there are situations where highly modified landscapes restrict the choice of seed sampling areas to small remnants where limited, poor quality seeds are available, and where harvesting impacts may be high (Broadhurst et al. 2008). Wild-species seed banks are not a suitable seed source for large-scale restoration and species reintroduction programs unless seeds are sampled specifically for this purpose. Our review goes some way to answer this criticism and to increase understanding of the extent to which these concerns are valid for globally threatened seed collections conserved ex situ at the MSB.

It is hard to determine specific standards for the conservation of seeds from wild plant species, and most of the theory is derived from studies on crops where plants have been bred to have higher and more uniform seed production and germination (Hay and Probert 2013). Given the apparent differences between wild species, especially rare and threatened, and domesticated crops, the quality and physiological status of reviewed collections are reasonably sound. The seed conservation protocols used at the MSB (for acquisition, drying, cleaning, storage, viability monitoring, regeneration, propagation, duplication, distribution, documentation, etc.) follow international genebank standards (FAO/IPGRI 1994; FAO 2013) as well as RBG Kew’s own experience in seed banking for over 50 years. As a result, the MSB produced its own seed conservation standards for use with seed banking of wild species. These are widely used across the MSB Partnership (MSBP) to ensure that high quality material is stored throughout the MSBP (Breman and Way 2018). In addition, technical information sheets produced by MSB, covering various aspects of seed conservation practice are published at https://www.kew.org/science/collections/seed-collection/millennium-seed-bank-resources.

These protocols are established to ensure that viable and mature seeds are collected, collections are sufficiently dried, processed and stored according to gene bank standards, the quality of seeds is assessed using several parameters, viability and longevity of seeds are monitored routinely through germination tests, and seed dormancy issues are handled by applying standardised seed-dormancy breaking protocols (see Appendix 1). Over time, adjustments for these protocols are necessary to address challenges in meeting the demands for genetic diversity, quality and germinability. Research is needed to investigate the causes for low genetic diversity or quality (e.g. population, collection or taxon specific, seed trait, climate at origin, etc.) and to develop germination protocols for taxa that achieved relative germination below 70%. It would also be interesting to undertake a similar review across the MSBP to help plan future conservation activities. For now, this review provides a comprehensive background to underpin future research in seed ecology and germination protocols for propagation of plants from seeds in various conservation activities. A summary of the review is also provided as supplementary material.

Implications for practice

Our review was based on a comprehensive and empirical dataset spanning over 45 years of worldwide conservation effort across various organisations. The data provides evidence-based results and future directions for the represented globally threatened flora that are of relevance to all plant conservation and seed banking organisations across the globe. The importance of these collections in the face of threats to global plant diversity cannot be overstressed. The characteristics we observed for collections (geographic and bioclimatic representation, taxonomic and genetic diversity and physiological status), the challenges we identified for conserving them and germination protocols we suggested for propagation of plants from seeds have the scope to be noted, integrated and used globally across various conservation activities and policies.

Crop germplasm has been conserved in seed banks for over 60 years to preserve crop diversity useful to future agriculture. Wild species seed banks are adapting this technology with a relatively short period of experience, whilst working with more variable and unknown parameters associated with wild species and gradually creating their own best practice. Capturing and maintaining accurate and comprehensive data from the point of harvesting seeds from habitats and during their life cycle in seed banks is essential for effective management of wild species, through understanding their geographic and bioclimatic origin, taxonomic identity, quality and physiological status.

There is no single seed sampling strategy which will equally capture genetic diversity of different wild plant species. However, representation of wild genetic diversity can be improved by adjusting individual sampling events according to spatial distribution pattern, population structure and reproductive strategy of the targeted species. It should be noted that seed ecology of many wild plants is unknown, and some species naturally fail to germinate successfully in their natural habitats or produce a large proportion of empty or embryoless seeds. These characteristics may be genetically bound and inherited at various taxonomic levels or influenced by the natural surroundings. To understand such correlations, it is also important to differentiate quantities of potentially usable and unusable seeds according to collections and subsequently under families, genera, species, etc. Establishing standardized seed germination protocols improves the viability assessment process and well-designed germination data recording facilitates effective data analysis to identify suitable germination protocols and to monitor viability and longevity of collections. There is a need for local, national and global conservation policy makers and professionals to focus on the quality and physiological status of germplasm conserved ex situ, and to fund comprehensive and extended seed sampling activities to capture local, national, regional and global genetic diversity. We must close the gap between seed ecology and seed banking for wild-origin collections, especially in relation to seed storage behavior and suitable measures to conserve short-lived and recalcitrant seeds.