Introduction

Plant genebanks that preserve seeds monitor the viability of their seed collections to ensure that accessions remain sufficiently viable. To this end, periodic germination tests are performed to determine whether stored seed lots still meet viability thresholds. When seed viability falls below the threshold, the seeds are replaced by regenerating the accession to produce a new generation of fresh seeds. The decision as to whether an accession needs to be regenerated or can be kept in storage is not without consequence. Apart from the costs and labor associated with testing and regeneration, regeneration itself poses a considerable risk to the genetic integrity of accessions (Rao et al. 2006; FAO 2014).

Many accessions held in genebanks, such as landraces and samples of wild populations, are heterogeneous, which implies that potentially valuable genetic variation is dispersed among different individual seeds in the population. The regeneration of a heterogeneous accession can be expected to change the genetic composition of that accession. Inadvertent selection of specific genotypes is one potential cause. For example, specific genotypes may be selected for or against, depending on variation in seed dormancy, flowering time, fertility, seed-set, responses to local growth conditions or post-harvest processing (e.g. seed cleaning) (Finch-Savage and Leubner-Metzger 2006; Austen et al. 2015; Schneider 2022; Larios et al. 2023). Other threats to an accession’s genetic integrity include genetic drift and inbreeding, and the possibility that accessions become contaminated, for example during seed handling or as a result of outcrossing with another accession during regeneration. Even though dedicated measures are taken to mitigate such risks, unnecessary regenerations are to be avoided as much as possible to preserve the genetic variation present within accessions (Sackville Hamilton and Chorlton 1997).

Nevertheless, the adverse effects of regeneration will, at a certain point, no longer outweigh the risk of losing a specific accession due to ageing. The deterioration of seeds over time has been a subject of study, and many factors affect the speed at which seed lots lose their viability. On the one hand, seed lots are genetically distinct, and differences in seed survival have been described at the species, accession and genotype level (Arif et al. 2022). As natural variation for seed survival exists, selection may occur among seeds in heterogeneous accessions, and can also lead to genetic drift when only a few plants remain for regeneration. On the other hand, the speed of deterioration depends on environmental conditions like humidity, temperature, oxygen levels or post-harvest processing, as reviewed by Solberg et al. (2020) and Whitehouse et al. (2020). Experiments on seed ageing have shown that seed viability curves (that typically run from 100 to 0% viability) have a sigmoidal shape, in which the decline in percentage viability varies over time (Ellis and Roberts 1980; Mead and Gray 1999). The fact that there is little experience with seed lots that have been in cold (− 20 °C) storage for longer times adds further uncertainty to determining the point at which seed lots are to be regenerated (van Treuren et al. 2013).

The decision regarding whether an accession needs to be regenerated is the outcome of a procedure for seed viability monitoring. Such a workflow typically consists of standardized viability tests that are executed at predefined intervals. The decision to regenerate (or not) is guided by the use of specified threshold values, after which seed lots are either regenerated or retained in the collection. The Centre for Genetic Resources, the Netherlands (CGN) maintains a plant seed genebank with over 23,000 accessions of orthodox seeded crops and their wild relatives. Periodic testing of the viability of CGN genebank accessions is part of its standard procedures, as documented in an ISO 9001:2015-certified quality management system. In recent years, the CGN has reviewed various elements of its workflow for viability testing, which resulted in the implementation of a new workflow. Before presenting this workflow, we discuss a number of elements that were subject to change and describe the reasons that led us to implement those changes. Thereafter, we describe the current workflow of seed viability monitoring at the CGN, and present experimental data to determine the quality and consistency of our new protocol.

Considerations when developing a workflow for seed viability monitoring

Seed viability testing: sample size and threshold values

The ‘Manual of seed handling in genebanks’ (Rao et al. 2006) defines seed viability as “the measure of how many seeds in a lot are alive and could develop into plants that will reproduce under appropriate field conditions”. Since growing seedlings into reproducing plants is not feasible in routine testing, the FAO Genebank Standards (FAO 2014) propose to measure seed viability in terms of germination values (i.e. percentages). The FAO provides guidelines for minimum germination percentages for seed lots in genebanks. Standard 4.3.2 states that “The initial germination value should exceed 85 percent for most seeds of cultivated crop species.” and 4.3.4 “The viability threshold for regeneration […] should be 85 percent […] of initial viability.” Regarding the sample sizes that should be used to determine these values, the standards are not very explicit: “Sample sizes for viability monitoring […] should be maximized to achieve statistical certainty. However, the sample size should be minimized to avoid wasting seed. Seed in a genebank is a valuable resource and should not be wasted.”

Sample size is an important issue, because the reliability of a germination test result depends directly on the sample size. If the true germination value of a seed lot is 80%, and only 50 seeds are tested, there is a 19.0% probability (as derived from the binomial distribution) that a test will result in values over 85% and thus in the incorrect conclusion that the seed lot has sufficient viability. But even if 200 seeds were used, there is a 2.8% probability that a test indicates sufficient viability despite a true below-threshold germination. Sample sizes also affect the probability of erroneously concluding that viability is insufficient. For example, if the true germination value were 90%, a sample size of 50 or 200 seeds would result in estimates below 85% with a probability of 12.2 and 1.6%, respectively. An excellent discussion of the many problems related to germination testing procedures, including relevant literature, was given by Hay and Whitehouse (2017). High reliability comes at high cost in terms of required seeds and labor. Consequently, most genebanks in practice test between 50 and 100 seeds per accession to assess seed viability (CGN internal (unpublished) report by van Dooijeweert and Menting 2018).
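
The misclassification probabilities discussed above follow directly from binomial tail sums. The snippet below is a minimal sketch in Python, not the calculation used by the authors. It assumes that a test "passes" only when strictly more than 85% of the tested seeds germinate, so that a result of exactly 85% counts as a fail; this tie-breaking convention and the function names are assumptions made for illustration.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def prob_pass(n, true_p, threshold_pct=85):
    """Probability that a test of n seeds shows strictly more than
    threshold_pct percent germination when the true germination is true_p.
    (Assumed convention: exactly threshold_pct percent does not pass.)"""
    k_pass = (threshold_pct * n) // 100 + 1  # smallest count strictly above the threshold
    return prob_at_least(k_pass, n, true_p)


if __name__ == "__main__":
    for n in (50, 200):
        false_accept = prob_pass(n, 0.80)      # true germination below threshold, test passes
        false_reject = 1 - prob_pass(n, 0.90)  # true germination above threshold, test fails
        print(f"n={n}: false accept {false_accept:.3f}, false reject {false_reject:.3f}")
```

With 50 seeds the two error probabilities come out at roughly 0.19 and 0.12, in line with the values quoted in the text. For sample sizes where exactly 85% is attainable (such as 200 seeds), the result depends on how a tie at the threshold is treated.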

Assessing whether seed viability has fallen to a certain percentage of initial viability (as proposed by FAO) is even more challenging, as the initial viability (typically measured many years earlier) and the newly assessed seed viability both have associated uncertainties. Obtaining the statistical power to detect a significant decrease in viability requires numbers of seeds that are an order of magnitude larger (i.e. hundreds to over 1000 seeds per assessment) than commonly used for viability testing (Ellis et al. 1985). As a consequence, genebanks (including the CGN) typically use fixed germination thresholds to decide whether regeneration is required. A survey among seven genebanks showed that germination thresholds varied from 70 to 90% for cultivated species and from 50 to 85% for wild species (CGN internal report by van Dooijeweert and Menting 2018).

It is relevant to note that setting thresholds for germination values has little meaning without indicating the required reliability of the estimates of these values. The statement that the germination value “must be at least 85%” needs to be accompanied by the number of seeds that needs to be tested. For example, testing 200 seeds with a threshold of 85% (i.e. a minimum of 170 out of 200 seeds must germinate for acceptance of the batch) could be interpreted as a required reliability of 95% that the true germination is above 80%. A similar reliability can be achieved when adopting the threshold of 87% when using 100 seeds, 90% with 50 seeds or 100% with a minimum of 14 seeds. But alternatively, the 85% germination threshold based on 200 seeds could also mean that the reliability of the germination being above 82.5% is 80%, requiring other thresholds when using fewer seeds. This is ambiguous and confusing. Therefore, if germination values are used as a standard for seed viability, as in the FAO Genebank Standards, the thresholds should be defined as the minimum probability of the true germination being above a given threshold. These thresholds can be presented in terms of minimum numbers of viable seeds given the number of seeds tested.
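
One way to read the equivalences above is as a one-sided binomial criterion: the minimum number of germinated seeds k is the smallest count for which observing k or more germinated seeds would have a probability of at most 5% if the true germination were only at the reference value. The sketch below assumes this reading; the function names are hypothetical, and whether the CGN table was derived with exactly this rule is not stated here.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_germinated(n, p0, reliability=0.95):
    """Smallest k with P(X >= k | n, p0) <= 1 - reliability.
    Returns None when even n out of n germinated seeds is not convincing enough."""
    alpha = 1 - reliability
    for k in range(n + 1):
        if prob_at_least(k, n, p0) <= alpha:
            return k
    return None


if __name__ == "__main__":
    # Reference germination of 80%, as in the example in the text.
    for n in (200, 100, 50, 14, 13):
        k = min_germinated(n, 0.80)
        if k is None:
            print(f"n={n}: no count reaches the required reliability")
        else:
            print(f"n={n}: at least {k} germinated seeds ({100 * k / n:.0f}%)")
```

Under this assumed rule, 13 seeds can never demonstrate the required reliability for a reference value of 80%, which is why 14 appears as the minimum sample size in the text.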

Historically, the thresholds applied by CGN were 80% for cultivated and 60% for wild species, based on a sample of 200 seeds. These thresholds are re-interpreted here as a 95% probability that the true germination is above 75 and 54%, respectively. These values in turn are translated into minimum numbers of germinating seeds given the number of seeds tested. As part of the new protocol, the CGN provides a table with these numbers to support its curators (Table 1).

Table 1 Threshold values used at the Centre for Genetic Resources, the Netherlands (CGN). The table presents the required minimum number of normal seedlings (and corresponding percentages) for reaching CGN’s minimal standards for germination. This standard is a probability of 95% (derived from the binomial distribution) that the true germination is above 75% (cultivated) and 54% (wild material)

The assignment of seed lot viability scores

Seed testing of commercial seed batches follows the rules of the International Seed Testing Association (ISTA 2023). ISTA provides detailed instructions for performing germination tests regarding number of seeds, substrate, light, temperature, pre-treatment for breaking dormancy, etc. During these tests the accessions are scored for numbers of normal and abnormal seedlings, the latter of which “do not show the potential for continued development into satisfactory plants when grown in good quality soil and under favourable conditions of moisture, temperature and light” and should not be considered to have germinated. Well-defined criteria to identify normal seedlings are pivotal for determining the germination value as discussed above.

For a long time, the CGN used the services of an external ISTA-accredited seed-testing laboratory for the germination testing of its genebank accessions. An analysis of the blind re-testing of accessions, performed annually on 5–10% of the tested samples, showed that repeatability was low. Significant differences in seed viability between a test and its retest occurred more than three times as often as expected if variation between tests were caused by sampling variation only (van Hintum and van Treuren 2012). Symptoms of senescence, due to the ageing of seeds, are generally not observed in commercial seed batches but are sometimes present among genebank accessions that have been in cold storage (− 20 °C) for long periods. Such signs of senescence include chlorosis, poor root development, necrotic lesions and/or lower vigor, of which the impact on plant viability is unclear. It was hypothesized that the high variation among duplicated tests may lie in the difficulty of assessing seedling defects in seed samples of low quality due to ageing. Moreover, genebank accessions are often genetically heterogeneous, which may add further phenotypic variation to germinating seed lots.

Consequently, the CGN started seed viability testing in-house in 2016 with the intent of gaining better control over, and insight into, the assessment of the quality of its seed lots. ISTA guidelines often served as a starting point and were amended and refined by our experts to derive criteria for separating normal from abnormal seedlings that better suit our genebank context. For example, to accommodate intra-accession variation, the CGN by default now extends the duration of its viability tests (i.e. from start to “final count”) to the maximum number of days suggested by ISTA guidelines. In spite of such alterations, it remains a continuous challenge to interpret variation between seedlings based on a minimum of information (i.e. because we use small sample sizes and only limited information on favorable germination conditions is available for tested accessions). Criteria that are too strict may cause unnecessary regenerations, while criteria that are too lenient may cause the loss of accessions. To accommodate these uncertainties, the CGN has implemented a new classification method in its workflow that allows seed lots to be marked as showing signs of ‘little concern’. Such seed batches, typically close to threshold values, can now be retained in the genebank, but will be scheduled for earlier retesting than seed lots that are well above thresholds. This flexibility in setting future testing intervals allows our staff to take decisions with reduced risk (FAO 2014).

Genebank curators and technicians often possess a wealth of experience and crop-specific understanding that allows them to make judgement calls when faced with the many uncertainties that revolve around seed testing. Their ability to draw from a reservoir of past scenarios and apply nuanced expertise enables them to make well-informed decisions and assess risks, meaning that they are the best decision makers when it comes to seed quality. Based on the outcome of our seed viability tests, our experts are asked to provide an indication of the status of the seeds on the scale of ‘no concern–little concern–concern–high concern’, corresponding to the four states as defined in Table 2. The method used for this determination differs for each crop and was developed by technical staff and curators, the latter having responsibility for assignments and subsequent procedures.

Table 2 Descriptions for state assignments for tested accessions and associated actions after seed viability tests

Testing can comprise classical germination tests in Petri dishes with controlled light and temperature conditions, after which state assignments follow from Table 2. But when a first test causes (high) concern and the germination threshold is not met, a second test is required to verify whether this is due to seed deterioration or to other reasons (like inadequate germination conditions). Such retests can follow the same procedure, or may use a different protocol, such as the inclusion of a disinfection step, the use of specific germination stimulants or other amendments to the testing procedure. For example, while seeds of many potato accessions will germinate well in Petri dishes (a practical, cheap assessment), particular accessions develop very poorly under such conditions. If a first test shows below-threshold germination, the seed lot is inspected to determine the possible cause, for example by dissecting non-germinated seeds. It may then be decided to perform more laborious and expensive greenhouse tests on those accessions. If the retested seeds show sufficient viability, the seed lot can be assigned state 1 or 2; if not, it is assigned state 3 or 4. Figure 1 shows examples of germinating seed lots with the states they were assigned.

Fig. 1 Details of germination tests illustrating symptoms of senescence due to seed aging. The upper panel shows highly contrasting germination results for lettuce (Lactuca sativa L.). The left picture a shows only normal seedlings (state 1), whereas in the right picture b nearly all seedlings have deformed cotyledons with physiological necrosis due to ageing (this seed lot was assigned state 4). The lower panel shows contrasting germination results for melon (Cucumis melo L.). The left picture c shows two normal seedlings (state 1) and the right picture d shows seedlings in which the primary root failed to develop (a sign of ageing)

Curators determine the procedure based on past experience with the crop or species, as long as they can reliably decide between the four different states. The state is recorded in the genebank documentation system, together with the date of testing, the testing method, results of germination percentage and the percentage of dormant seeds. The ‘testing method’ is recorded as a number that refers to a germination protocol defining the germination conditions (for example the use of a Petri-dish or greenhouse, temperature, daylength and the number of seeds tested).

Workflow and seed viability monitoring intervals

The test frequency is another important component of a workflow for monitoring seed viability. The FAO Genebank Standard 4.3.3 states that “Viability monitoring test intervals should be set at one-third of the time predicted for viability to fall to 85 percent of initial viability or lower depending on the species or specific accessions, but no longer than 40 years. If this deterioration period cannot be estimated and accessions are being held in long-term storage at –18 °C in hermetically closed containers, the interval should be ten years for species expected to be long-lived and five years or less for species expected to be short-lived” (FAO 2014). It adds a footnote with a link to an online application based on the Ellis and Roberts viability equations allowing a genebank curator to predict the lifespan of an accession (Society for Ecological Restoration 2023).

At the CGN all accessions that are to enter the collection are subjected to an initial seed viability test, using a sample of 50 seeds. If the number of acquired seeds is below (crop-specific) guidelines, accessions are regenerated before testing. Acquired seed lots are stored in the genebank if the initial seed viability is at least 84 or 66% for cultivated or wild material, respectively (Table 1). The CGN strives to include new seed lots in its collection that are well above threshold values. If the germination percentage during the initial seed viability test is close to, or slightly below, our threshold value (Table 1), seed cleaning followed by retesting may be used to improve the germination percentage of seed lots. Occasionally (< 1%), accessions of state 2 are included in our collection, for example when an accession, even after a second regeneration cycle, shows a germination percentage that is close to our threshold value. Seed lots of states 3 and 4 are required to be regenerated again. CGN seed lots are dried to 3–7% moisture content and stored at − 20 °C under near vacuum in aluminium foil bags. A historical analysis of the germination values that were obtained in seed viability monitoring tests indicated that 3.3% of tests showed below-threshold germination values after 25 years of storage (van Treuren et al. 2013). Therefore, the period to a first monitoring germination test was set to 25 years after regeneration. The accessions that enter the genebank with state 2 are scheduled for a first monitoring test after 5 years.

After adding new seed lots to the collection, periodic monitoring tests are performed to check seed viability. Seed lots that are classified as state 1 after a monitoring test will be stored for another 20 years in the case of small grains or 10 years for other crops. Accessions in state 2 are targeted for closer monitoring and the period to the next monitoring test is reduced to 5 years. Seed lots that are close to or below the threshold will immediately be tested in a second monitoring germination test. After this second test, seed lots will be assigned a final state (1 to 4). Seed lots that are assigned to states 3 and 4 are to be regenerated within 5 years or as soon as possible, respectively.
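
The scheduling rules described in this section can be summarized as a small lookup. The following is an illustrative sketch only: the function name, the return format and the use of `None` where the text gives no period are assumptions, and the sketch presumes that the state passed in is the final state (i.e. after any second test).

```python
def next_step(state, test_type, small_grain=False):
    """Map a final state (1-4) to an (action, years) pair.

    test_type is "initial" (test on entry into the collection) or
    "monitoring" (periodic test during storage).
    years is the period until the next test for stored lots, or the period
    within which regeneration is due; 0 stands for "as soon as possible"
    and None means that the text gives no period.
    """
    if state not in (1, 2, 3, 4):
        raise ValueError("state must be 1, 2, 3 or 4")
    if test_type not in ("initial", "monitoring"):
        raise ValueError("test_type must be 'initial' or 'monitoring'")

    if state == 1:
        if test_type == "initial":
            return ("store", 25)  # first monitoring test after 25 years
        return ("store", 20 if small_grain else 10)
    if state == 2:
        return ("store", 5)  # closer monitoring
    if test_type == "initial":
        return ("regenerate", None)  # states 3 and 4 are regenerated again
    if state == 3:
        return ("regenerate", 5)  # within 5 years
    return ("regenerate", 0)  # state 4: as soon as possible


if __name__ == "__main__":
    print(next_step(1, "monitoring", small_grain=True))
    print(next_step(2, "initial"))
    print(next_step(4, "monitoring"))
```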

A schematic representation of the CGN workflow, including the duration of monitoring intervals is shown in Fig. 2. Further details on the duration of monitoring intervals and actions that result from state assignments in initial- and monitoring viability tests are also summarized in Table 2. Since experimental data about seed viability during genebank storage are not yet available for very long storage periods, and also experience with interpreting the symptoms of senescence is largely absent (Whitehouse et al. 2020), monitoring intervals may be extended as more empirical data are collected.

Fig. 2 Schematic representation of seed viability monitoring at the Centre for Genetic Resources, the Netherlands (CGN). Arrows indicate the flow of seeds from acquisition to cold storage (in the blue, rounded box), through an initial viability test. If only few seeds were initially acquired, accessions are regenerated first. After storage, seeds are tested in a monitoring viability test (bottom), after which seeds are either retained in cold storage or are regenerated. Circled numbers one through four refer to the state of the seeds as determined by CGN staff after testing. The state of seeds is based on the germination percentage derived from the viability test (cf. Tables 1 and 2) and indicates levels of concern, ranging from ‘no concern’ (state 1) to ‘high concern’ (state 4). Seed lots in state 2 are scheduled for an earlier monitoring test than seed lots of state 1. States 3 and 4 lead to accession regeneration, in which state 4 takes priority over state 3. The time spent in cold storage until further testing is indicated next to each arrow. Please note that both the initial and the monitoring seed viability tests in this diagram may consist of two subsequent tests if the germination percentage in the first viability test is below the required threshold value

Evaluation of seed monitoring

Seed viability tests at the CGN

After setting up viability testing in our own laboratories, the CGN has kept its policy of blind re-testing about 5% of tested samples, in line with its previous practice of performing control experiments (Rao et al. 2006; van Hintum and van Treuren 2012; FAO 2014). Here, an evaluation of these first re-testing data is presented (covering the period 2018–2022). Figure 3 shows the (duplicated) germination test results for 180 seed lots. To test whether pairwise differences may result from sampling variation, a two-tailed χ² test was performed to single out test pairs in which germination percentages differ more than expected based on chance alone (α = 0.05). Samples that differ more than expected are shown as open circles and squares.
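
A pairwise comparison of this kind can be sketched as a χ² test on a 2 × 2 table of germinated versus non-germinated seeds for the two tests. The snippet below is a minimal illustration, assuming Pearson's χ² statistic with one degree of freedom and no continuity correction; whether a correction was applied in the analysis is not stated, and the seed counts in the usage example are hypothetical.

```python
from math import erfc, sqrt


def chi2_pair(germ1, n1, germ2, n2):
    """Pearson chi-square test (1 df, no continuity correction) comparing
    two germination tests: germ1 of n1 seeds versus germ2 of n2 seeds.
    Returns (chi2, p_value)."""
    a, b = germ1, n1 - germ1  # first test: germinated, not germinated
    c, d = germ2, n2 - germ2  # second test: germinated, not germinated
    col_germ, col_not = a + c, b + d
    if col_germ == 0 or col_not == 0:
        return 0.0, 1.0  # identical all-or-nothing results: no difference to test
    n = n1 + n2
    chi2 = n * (a * d - b * c) ** 2 / (n1 * n2 * col_germ * col_not)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical pair: 45 of 50 seeds germinated in the first test, 35 of 50 in the retest.
    chi2, p = chi2_pair(45, 50, 35, 50)
    print(f"chi2 = {chi2:.2f}, P = {p:.4f}, differs at alpha 0.05: {p < 0.05}")
```

For very high or very low germination percentages the expected counts in some cells become small, so that the χ² approximation is less reliable there.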

Fig. 3 Germination control tests of 180 seed lots. P indicates the probability of the results by chance alone, while n represents the number of included pairs. For data points with P < 5%, a distinction is made for germination values (G) < 90% and values ≥ 90%

Sample pairs at the margins of the distribution range (i.e. with G ≥ 90%) may be labeled as ‘suspicious’ as a result of small differences in germination (open circles). Consequently, we focus our analysis on those samples in which the germination percentage is < 90% for either one of the samples (open squares), in line with van Hintum and van Treuren (2012). Of those samples 11.1% (20 out of 180) can be identified as statistically different, where 5% would theoretically be expected.

Our current dataset (n = 180) of re-tested seed lots (in the CGN laboratory) is not yet large, and has various potential sources of experimental variation. One of these sources stems from the fact that in 54% of repeated germination experiments, the re-test was performed using a different germination protocol than the one used in the first assessment. In addition, re-tests were typically performed one year after the first assessment. Since seedling assessment procedures and germination testing procedures were (and still are) under development at the CGN laboratory, assessment procedures may change over time. Further details on potential sources of experimental variation are given in the section “Material and methods”.

Given that our dataset has various potential sources of experimental variation, it is interesting to note that the estimated percentage of ‘suspicious’ samples (11.1%) suggests an improvement with respect to our previous practice. When germination tests were executed by ISTA-accredited laboratories following standardized ISTA protocols, 18.1% of duplicated samples were classified as suspicious (van Hintum and van Treuren 2012).

State assignments at the CGN

The results of viability tests are typically variable (Fig. 3). Because the outcomes of germination tests guide decision making, it is to be expected that state assignments are variable too. Table 3 summarizes state assignments related to the samples in our re-testing experiments. This dataset comprises a subset of the accessions for which data are shown in Fig. 3. Table 3 lists the two state scores that CGN curators assigned to 161 seed samples that were re-tested for control purposes. It should be noted that both state assignments were made after a single germination experiment (in both the first and the re-test experiment). This dataset is slightly smaller than the dataset of 180 test pairs used to construct Fig. 3, because we excluded test pairs for which more than one germination experiment was conducted to assign a state in either the first or the second assessment.

Table 3 State assignments in 161 accessions for which a state was assigned in two consecutive years, following a retest. State assignments were done based on one germination test

The decision to regenerate (or not) depends on the assignment of an accession to either state 1–2 or to state 3–4. Of the tested accessions, 84% (135 out of 161) were assigned a state 1 or 2 in both the first and second assessment. A state of 3 or 4 was assigned in both assessments to 5.6% of accessions (nine accessions; Table 3). This leaves a total of 17 cases (10.6%) for which the decision to regenerate was opposite between assessments. These 17 cases include accessions for which the viability assessments were close to threshold values, as well as instances in which an accession likely responded very favorably to a different germination test protocol that was used in one of the germination experiments.

It is notable that for the given dataset, the number of accessions given opposite regeneration advice after two tests (n = 17) is almost twice as large as the number of accessions that were given a regeneration advice twice (n = 9). Though such numbers are highly dependent on the tested accessions, this suggests that a procedure to retest accessions that do not meet the seed viability threshold after a first test can be expected to result in a revised regeneration decision for about half of the re-tested accessions. Even though this set of accessions is slightly biased towards accessions with a score of 3 or 4 (see Material and methods), it nicely illustrates the ambiguity in decision making. For this reason, the CGN decided to retest every accession that is given a state 3 or 4 after a first seed viability assessment.

Discussion

Reliably assessing whether an accession needs to be regenerated or can be kept in storage is one of the most complex balancing acts in genebank practice. Obtaining high certainty with respect to seed viability comes at high costs in terms of labor and valuable seeds that are to be used in the viability tests. These trade-offs between costs and statistical reliability have been well described in the literature, but likewise it has been acknowledged that the formulation of practical guidelines is very difficult (Hay and Whitehouse 2017). Finding the right balance between the costs and making the right regeneration decision (with its potential for genetic loss) is the elephant that resides in the storage rooms and laboratories of most genebanks.

The CGN modified its workflow for seed viability testing in response to a number of observations. There were experimental data (van Hintum and van Treuren 2012; van Treuren et al. 2013) that suggested that procedures for viability testing and monitoring intervals required modifications. Setting up testing procedures at the CGN laboratories prompted us to be critical regarding the criteria that are used to identify (ab)normally developing seedlings. As described, the assignment of ‘viability states’ to seed lots can accommodate a level of doubt and can result in a reduction of the time between subsequent viability tests. For many crops it was decided to reduce the number of tested seeds. In order to ensure consistency, we reformulated previously used germination thresholds into a minimum reliability of the true germination being above a given threshold. The current workflow is practical and, to the best of our judgement, an improvement on our previous practice. Seed viability tests are more reproducible now than was previously the case. As methods are still being optimized, further improvement can be expected.

By acknowledging that best-practice rules and thresholds are very difficult to define, the new CGN protocol allows, in theory, for more lenience with respect to regeneration thresholds. Accessions that show germination percentages below suggested viability thresholds may now be kept in our collection as state 2 when a curator is of the opinion that a certain seed batch is not at risk of dying in the next 5 years. It can be observed from the data in Table 3 that of the accessions that are assigned a state 2 in one of the assessments, about two-thirds (22 out of 34 accessions) are assigned to state 1 in replicate experiments. This suggests that our curators may be relatively cautious with at least some assignments.

Strict application of regeneration thresholds as defined by FAO (2014) may lead to unnecessary regenerations but may also lead to false decisions not to regenerate. By allowing curators to be lenient with threshold values (which may result from judgment criteria for abnormal seedlings), the number of ‘wrong decisions’ can be reduced. Curators and staff can improve their judgement calls over time by recording various aspects of germinating seeds and seedlings and comparing such notes with observations upon retesting. In addition, curators and staff can perform experiments to test whether changes in protocols may improve germination percentages. Other tests, like tetrazolium viability tests (França-Neto and Krzyzanowski 2019), might provide additional insight into seed viability, but require time and resources and may complicate decision making.

Genebank management that is aimed at keeping the accessions alive and available obviously benefits from simplicity. Genebanks deal with very large numbers of germination test results and, in the end, have to make reliable decisions about which accessions to regenerate and which to store for a prolonged period. Providing threshold guidelines (as defined in Table 1) makes clear to curators which thresholds apply. In addition, the four ‘viability states’ provide curators with the option to capture information about the urgency of regeneration and possibly the need for tighter monitoring. The new CGN protocol does not solve the problem of distinguishing between normal and abnormal seedlings, but allows the curator to make an experience-based judgement of the vigor and appearance of the seedlings, using the threshold percentages for support in that judgement.

The new protocol does not comply to the letter, but certainly does to the spirit of FAO standards: critically monitor seed viability to prevent the loss of precious genebank accessions.

Material and methods

Seed viability tests

The selection of seed lots for re-tests of seed viability was done as follows: from among the seed lots tested in a specific year, we randomly selected 10 accessions per species group that were used for re-testing in the following year. If no accessions of state 3 or 4 were among the randomly drawn accessions, the last selected accession was replaced by a randomly selected accession of state 3 or 4. Accessions for which seed stocks were low, or for which a regeneration was in progress because of high user demand, were excluded from re-testing.

The CGN has pre-packaged seed lots that are destined for future testing. Seed lots of accessions that were selected for re-testing were taken again from our storage facility, and thus seed bags that were selected for re-testing were not the same bags as used for the initial test. Re-tests were done blind to the technician performing these tests. In total 180 seed lots were tested, covering 35 different species. Since different protocols may be available for testing, 83 seed lots were subjected to the same testing method, whereas 97 seed lots were subjected to a different method. For 19 seed lots, two or more assessments were made either in the first or the second year. If multiple tests were performed during a single assessment, the highest recorded germination percentage between those tests was used as the germination percentage in Fig. 3. Test results are provided as an Online Resource.

State assignments

Samples for which a state assignment was based on several tests (either in the first or in the second year of testing) were not included in Table 3.