The Design of Early-Stage Plant Breeding Trials Using Genetic Relatedness

The use of appropriate statistical methods has a key role in improving the accuracy of selection decisions in a plant breeding program. This is particularly important in the early stages of testing in which selections are based on data from a limited number of field trials that include large numbers of breeding lines with minimal replication. The method of analysis currently recommended for early-stage trials in Australia involves a linear mixed model that includes genetic relatedness via ancestral information: non-genetic effects that reflect the experimental design and a residual model that accommodates spatial dependence. Such analyses have been widely accepted as they have been found to produce accurate predictions of both additive and total genetic effects, the latter providing the basis for selection decisions. In this paper, we present the results of a case study of 34 early-stage trials to demonstrate this type of analysis and to reinforce the importance of including information on genetic relatedness. In addition to the application of a superior method of analysis, it is also critical to ensure the use of sound experimental designs. Recently, model-based designs have become popular in Australian plant breeding programs. Within this paradigm, the design search would ideally be based on a linear mixed model that matches, as closely as possible, the model used for analysis. Therefore, in this paper, we propose the use of models for design generation that include information on genetic relatedness and also include non-genetic and residual models based on the analysis of historic data for individual breeding programs. At present, the most commonly used design generation model omits genetic relatedness information and uses non-genetic and residual models that are supplied as default models in the associated software packages. The major reasons for this are that preexisting software is unacceptably slow for designs incorporating genetic relatedness and the accuracy gains resulting from the use of genetic relatedness have not been quantified. Both of these issues are addressed in the current paper. An updating scheme for calculating the optimality criterion in the design search is presented and is shown to afford prodigious computational savings. An in silico study that compares three types of design function across a range of ancillary treatments shows the gains in accuracy for the prediction of total genetic effects (and thence selection) achieved from model-based designs using genetic relatedness and program specific non-genetic and residual models. Supplementary materials accompanying this paper appear online.


INTRODUCTION
Plant breeding is focused on the objective of genetic improvement, producing new varieties with increased productivity and quality. Most plant breeding programs follow a method of breeding referred to as a pedigree selection method. This method implies that programs are structured around the grouping of breeding lines which have been derived as progeny of a fixed number of crosses between elite parents. Different crosses are made each year, and the cohort of breeding lines then undergoes selection through preliminary and advanced stages of testing.
Traits of interest for selection in the preliminary stage include disease and herbicide tolerance, phenology type and functional grain quality. Selection intensity is high, reflecting the relatively high heritability of these traits or the ability to use marker-assisted selection techniques for simply inherited traits.
Typically, selection in the advanced stages occurs in a sequential manner. These stages are referred to as the S1, S2, S3 and S4 stages. The key selection trait in the advanced stages is grain yield. Yield data for each stage are generated from a series of field trials sown at several locations. Trials typically include the set of breeding lines of interest and also some check varieties, which will collectively be termed entries. This paper focuses on S1 and S2 stage trials, and we refer to these as the early-stage trials. The number of entries which are evaluated at the S1 and S2 stages often exceeds 1,000, while at the S3 and S4 stages, the number of entries tested is generally less than 100. At the S1 and S2 stages, resource and seed limitations reduce the numbers of locations and plots that can be used for each breeding line. Replication within a location is often less than two, and the number of locations is usually less than four. For the S3 and S4 stages, entries are evaluated at up to ten locations, with between two and three replicates per location. Given the relatively low heritability for the trait of grain yield, and the presence of variety (entry) by environment interaction, it is critical to adopt efficient experimental designs and appropriate methods of analysis.
In Australia, the preferred approach for the analysis of multi-environment trial (MET) data sets is the factor analytic linear mixed model of Smith et al. (2001). Their original approach considered modelling of the non-genetic effects within each environment using the spatial approach advocated by Gilmour et al. (1997) and modelling the variety by environment effects using a factor analytic model. Their approach did not include genetic relatedness for the variety effects for each environment. Oakey et al. (2006) addressed this for the analysis of a single trial, and later Oakey et al. (2007) incorporated genetic relatedness through the use of ancestral information for the models advocated by Smith et al. (2001). More recently, Smith and Cullis (2018) developed factor analytic selection tools to assist with selection decisions from a factor analytic linear mixed model analysis of MET data sets. These methods are in widespread use for the analysis of MET data sets in the advanced stages of selection in Australia.
There is an extensive literature on the design of field trials for plant breeding programs. Designs used for these trials fall into two broad categories: classical or optimal model-based designs. The former include complete and incomplete block designs, row-column designs or α-designs (see Bailey 2008;Williams 1995, 1998;Patterson and Williams 1976, for example). The principle of model-based design is to search the design space for a design function (Bailey 2008), which is near optimal under a prespecified model. The model used for the design of field trials for plant breeding programs is usually a linear mixed model, which is consistent with the linear mixed model used for the analysis. Early work on modelbased design for plant breeding selection trials focussed on methods to find optimal designs for spatially dependent data (Martin 1986;Martin et al. 2006;Martin and Eccleston 1992;Chan 1999). More recently, model-based designs have been considered for more general linear mixed models which include spatial dependence and blocking factors (Butler et al. 2008;Coombes 2002;Williams and John 2006). Cullis et al. (2006) introduced p-rep designs to address the problems associated with the design of S1 and S2 stage trials where the number of plots for each breeding line in a trial is less than two. They showed that p-rep designs improved the accuracy of selection of breeding lines compared to so-called grid-plot designs (Kempton 1982). The p-rep designs are in widespread use in most plant breeding programs in Australia. These designs can be produced efficiently using statistical software such as the DiGGeR package (Coombes 2009). Williams et al. (2011) developed augmented p-rep designs by combining p-rep and augmented check plot designs.
In order to align model-based design with current methods for the analysis of S1, S2 and S3 stage field trials, recent work has focused on the inclusion of genetic relatedness through the use of ancestral information. Bueno Filho and Gilmour (2003) considered the construction of non-resolvable incomplete block designs when the treatments effects are correlated. Their study considered three choices of genetic relatedness in a simplistic setting of six treatments in blocks of size four. Within the limited scope of their study, they concluded that for most situations, it would be reasonable to use a design which is optimal for unrelated treatment effects. Piepho and Williams (2006) investigated three types of design for a simple genetic structure for the set of treatment effects. On the basis of a simulation experiment, they concluded that the assumption of correlated treatment effects was superior to assuming fixed treatment effects. They also recommended that an unrestricted α-design with a simple family-based treatment structure could be used for the design of field trials in plant breeding programs. Butler et al. (2014) presented a more general approach for the design of field trials in a plant breeding program. Their approach was based on finding designs which were (near) optimal under a linear mixed model which partitioned the total genetic variance into additive and non-additive effects for inbred crops or simpler models for non-inbred crops which only included additive effects. Models for plot effects (Bailey 2008) were either classical (using fixed or random block effects) or based on spatial dependence or a mixture of both. Butler et al. (2014) illustrated their methods using two examples from S1 and S2 stage trials from a canola and sorghum breeding program, respectively. Their designs were generated using the OD-V1 statistical software package (Butler 2013).
There remain at least two issues which hinder the adoption of the Butler et al. (2014) designs for selection experiments in plant breeding programs. Firstly, there is a lack of empirical evidence regarding the improvement in accuracy relative to other classical or model-based designs which are currently in use. Secondly, the computational load for the design search is prohibitive for trials with a larger number of entries. The aim of this paper is to therefore address these issues. The issue of computational burden is addressed by development of an updating algorithm for evaluation of the optimality criterion, which is an extension of the algorithm described by Martin and Eccleston (1992) and Chan (1999), to allow for correlated treatment effects. Secondly, the empirical advantage of model-based designs with correlated effects is compared to two other commonly used designs using an in silico study, based on a large set of field trials from a plant breeding program based in Australia.
The paper is arranged as follows. In Sect. 2, we present a detailed analysis of a case study based on a set of early stage trials grown in 2018 by the four publicly funded pulse breeding programs in Australia. In Sect. 3, we present an approach to model-based design, including derivation of the updating formula for correlated treatment effects which addresses the computational burden associated with calculating the optimality criterion for each interchange in the design search. In this section, we also demonstrate how to obtain a near-optimal design using the R package OD-V2 (Butler and Cullis 2018) and provide some results on the reduction in computing time using the updating formula. We conclude in Sect. 4 with an in silico experiment designed to assess the performance of model-based designs using genetic relatedness and specific non-genetic models.

CASE STUDY
Given that the model used for model-based design is "usually chosen to be as close as possible to that expected for the analysis" (Butler et al. 2014), a case study involving the analysis of 34 early stage trials was conducted. Further, the case study is fundamentally important to providing unequivocable evidence of the need to incorporate genetic relatedness in the design of field trials for plant breeding programs. Cullis et al. (2006) developed p-rep designs with the use of genetic relatedness in mind. However, they stated that pedigree information was not used for the analysis of early generation variety trials at that time. Further extensions of p-rep designs do not consider the use of genetic relatedness (Williams et al. 2011), while many other recent approaches to the design of field trials for plant breeding programmes have also ignored the use of pedigree information (see, for example, Piepho et al. 2018Piepho et al. , 2016Williams and Piepho 2018). This is despite the widespread adoption of the approaches of Oakey et al. (2007) for the analysis of both single-site and multi-environment trial plant breeding data sets.
The trials comprised S1 and S2 stage trials from four Australian public pulse breeding programs in 2018. The case study was used to highlight the importance of including information on genetic relatedness in the analysis and to summarize models used for non-genetic effects. Final columns give the total number of entries in the pedigree and the number of entries with phenotypic data

GENETIC MATERIAL, EXPERIMENTAL DESIGNS AND PHENOTYPING
Comprehensive ancestral information was provided for all four programs, and this was used to form numerator relationship matrices (Meuwissen and Luo 1992) that are summarized in Table 1. The median generation number was 8, 5, 10 and 6 for chickpeas, fababeans, field peas and lentils, respectively. The median inbreeding coefficient for entries in all programs is consistent with the high level of inbreeding achieved prior to inclusion in yield evaluation trials. The average genetic relatedness (Dunner et al. 1998) is also high, reflecting the strong familial structure of the breeding lines in each of the programs. It is important to note, however, that there is substantial heterogeneity of the genetic relatedness within each program. Those entries with low genetic relatedness were predominantly older commercial varieties which have not been used as parents in recent crosses, while those entries with high genetic relatedness were either recently released commercial varieties which have been used as parents in many crosses or breeding lines from (full sib) families with a large representation in trials. Table 2 presents a summary of key features of the 34 trials. All trials were laid out in rectangular arrays of plots, indexed by field columns and field rows. Plot dimensions were long and thin, with columns longer than rows. The trait of interest here is grain yield (t/ha). The mean yield varied substantially between trials, with some trials being severely affected by drought. Generally, S1 stage trials had more entries than S2 stage trials but fewer plots per entry. Partially replicated designs  were used for ten out of the 12 S1 stage trials and four out of the 22 S2 stage trials. These so-called p-rep designs were resolvable (or near resolvable) with respect to the entries with more than one plot. The remaining trials were designed as resolvable two or three replicate block designs. In most cases, there were also additional plots of commercially important varieties. The resolvable blocks were aligned either with columns (referred to as column replicate blocks: CRep), aligned with rows (referred to as row replicate blocks: RRep) or in both directions (trials 12 and 13). In all cases, the resolvable column or row replicate blocks spanned multiple columns or rows, respectively. Most designs were constructed by staff within each breeding program using either DiGGeR (Coombes 2009), with default parameter settings, or Agrobase (Mulitze 1990). Pedigree information was not used in the design process for these trials. Designs for trials 1, 11, 14, 15, 22, 27 and 29 were constructed using the methods to be described in the current paper using OD-V2 (Butler and Cullis 2018).

STATISTICAL MODELS FOR ANALYSIS
The approach used for the analysis is similar to that proposed by Oakey et al. (2006). They considered two linear mixed models with differing assumptions regarding the distribution of the (random) genetic effects. They referred to these models as the standard and pedigree models. In both cases, the non-genetic effects are modelled according to the approach devised by Gilmour et al. (1997), who allowed for three possible sources of variation, namely global, extraneous and local. The linear mixed models considered in this paper also include, by default, random terms which respect the design construction process. These included terms for resolvable or near-resolvable column or row replicate blocks for all designs (either one or both as required) and terms for columns and rows when the designs were constructed using model-based approaches. Low-order polynomials in columns and/or rows were not considered. The residual vector was modelled as a separable first-order auto-regressive process (Gilmour et al. 1997).
Following Oakey et al. (2006), the vector of genetic effects is partitioned into additive and non-additive effects given by where it is assumed that and v is the number of entries in the pedigree. The pedigree model estimates σ 2 a and σ 2 e by fitting both terms, while the standard model excludes the additive effects, ignoring genetic relatedness. All analyses were conducted using Version 4 of the ASReml-R package ).

RESULTS OF ANALYSIS
To allow for an unambiguous comparison of the fit of the two genetic models, all terms and variance models not associated with the genetic effects were chosen while fitting the pedigree model and thence retained for the fit of the standard model. Table 3 presents a summary of the residual maximum likelihood (REML) estimates of the variance parameters associated with the non-genetic terms in the random model and the variance model for the residual. Random terms associated with resolvable blocks were always included in the model, while random terms which were non-resolvable, namely columns and rows, were only included as required. The major source of non-genetic variation in the random model terms is due to the (long) columns, being fitted on 33/34 occasions. This result supports the findings in Oakey et al. (2006), who also fitted terms associated with columns on most occasions (11/14 trials). This has important implications for choosing terms to be included in the random model for construction of near-optimal model-based designs (see Sect. 3.5.1).
The REML estimates of the row and column autoregressive parameters are moderate to high for all programs with the estimates for the row dimension being greater than that for the column dimension (which is consistent with the shape of the plots). The REML estimates of these parameters vary between crops but are somewhat lower than those reported by Oakey et al. (2006), since they fitted a measurement error component. Table 4 presents a summary of the REML estimates of the genetic variance parameters and the model-based reliability (Mrode 1995) of the prediction of the total genetic effects. Additive genetic variance is the dominant source of genetic variance in all programs, though the percent additive genetic variance does vary considerably between programs. These findings are consistent with those of Oakey et al. (2006) who reported a median percentage additive genetic variance of 84 for the 14 S3 stage wheat trials. The reliability of the predicted total genetic effects is typical of these trials. The column labelled Ni for the random terms is the number of trials which included the term. The column labelled Nl for the residual autoregressive correlations (Col AR and Row AR) is the number of trials where the ratio of the REML estimate to its asymptotic standard error exceeded 1.5  Figure 1 presents a scatter plot of the log base 10 of the REML likelihood ratio test to test the hypothesis of zero additive genetic variance against the log base 10 of the number of entries with data for 33 trials (one trial was removed due to a near-zero estimate of genetic variance). Critical values for the REML likelihood ratio test were obtained from the lrt.asreml method within ASReml-R , which implements the approach described in Self and Liang (1987). The hypothesis of zero additive genetic variance was strongly rejected for the majority of the trials, with the exceptions associated with trials with low values of genetic variance or small numbers of entries. These results are again consistent with those of Oakey et al. (2006). Bailey (2008) defines a comparative experiment as an experiment in which we are interested only in contrasts between treatments. Early stage trials are comparative experiments which typically have a simple treatment structure. The aim is selection of the subset of breeding lines which are superior and hence will progress to the next stage of testing. An experimental design has three key elements: the plot structure, the treatment structure and the so-called design function. The design function is a function, T which allocates treatments to plots. Bailey (2008) defines the treatment structure as meaningful ways of dividing  Figure 1. Scatter plot of the log base 10 of REML likelihood ratio statistic for the test that σ 2 a = 0, against log base 10 of the number of entries in the trial. One trial removed due to near-zero genetic variance. The solid horizontal line is located at the nominal 5% critical value for the statistic.

MODEL-BASED DESIGN
up the set of treatments (T ), the plot structure as meaningful ways of dividing up the set of plots ( ), ignoring treatments, and the design function as a function from to T . In classical design, the function T is chosen to satisfy certain combinatorial properties, whereas in model-based design the function T is chosen to result in a design which is optimal or near optimal for a prespecified model. The model considered in this paper is a linear mixed model.

LINEAR MIXED MODEL
In the following, we present a linear mixed model for y, the n-vector of data, which is suitable for the model-based design (and analysis) of a comparative experiment. Nelder (1977) introduced the concept of two aspects of a random effect. He remarked that one kind of random term in a linear model is a component of error, while the other kind of random term represents those effects of interest. The latter type of random term will therefore be in the set of treatment factors, while the former type of random term will be in the set of plot factors. Applying this broad principle, we consider a linear mixed model with four components given by where τ o and τ p are vectors of fixed effects with associated design matrices X o and X p with c x o and c x p columns, respectively; u o and u p are vectors of random effects with associated design matrices Z o and Z p with c z o and c z p columns, respectively, and e is the vector of residuals. The subscripts o and p identify so-called objective and peripheral fixed and random effects, respectively. Equation (2) can be written succinctly as where The random effects and residuals in (2) are assumed to follow a normal distribution such that: where G o , G p and R are positive definite matrices assumed to be functions of vectors of variance parameters σ g o , σ g p and σ r , respectively. Model-based design requires values for these parameters so in the following they are regarded as known.

PREDICTIONS OF INTEREST
The aim is to find an optimal or near-optimal design with respect to a d-vector of estimable The vector of estimable functions, π, involves only objective effects, but may involve fixed, random or both fixed and random effects. Gilmour et al. (2004) provide a computationally efficient algorithm for forming predictions from the linear mixed model specified in (3). For brevity in the following, we use the terminology of Gilmour et al. (2004) and refer to π as the vector of predictions, which we assume are estimable. Gilmour et al. (2004) provide a simple test of estimability of predictions which fit naturally into their prediction algorithm. Briefly, given D, the vector of predictions and associated prediction error variance/covariance matrix are formed by recursive absorption from an extended set of mixed model equations (see Robinson 1991, for example). Full details can be found in Gilmour et al. (2004). The mixed model equations (MMEs) for (3) are given by where the coefficient matrix in (4) is with It follows that the reduced set of MMEs forβ o are given by It can be shown that The matrix P p has rank n − rank X p and is unique, and it is the Moore- For known σ g o , σ g p and σ r where = DC − oo D and C − oo is a particular generalized inverse of the coefficient matrix of (6).

OPTIMAL DESIGN CRITERIA
In the context of early-stage trials, or more generally for comparative experiments, the most widely used and useful optimality criterion is the A-optimality criterion (Martin 1986). Bueno Filho and Gilmour (2007) developed a Bayesian design criterion for selection experiments in plant breeding based on a utility function that minimizes the risk of an incorrect selection. They show that this is in fact the A-optimality criterion based on the PEV matrix for the vector of random entry effects. Bueno Filho and Gilmour (2003) and Cullis et al. (2006) use this criterion for generating optimal or near-optimal designs for plant breeding designs when the treatments are correlated and for so-called p-rep designs used in earlystage trials. In this case, it can be shown that where d ≤ v is the number of varieties for which the A-value is computed and pev () refers to the prediction error variance of its scalar (or matrix) argument. A convenient form for computing A is given by We seek a design which minimizes A over all valid design functions of the design.

DESIGN SEARCH AND UPDATING FORMULAE
The design search process presented in Butler et al. (2014) and implemented in OD-V2 (Butler and Cullis 2018) can be summarized as follows: given an initial design, the A-value is optimized under the supervision of a search strategy (for example, a tabu search Glover 1989) for (pairwise) interchanges of the rows of W o .
Each interchange requires evaluation of the A-value, which is based on the prediction error variance matrix . This requires expensive matrix calculations, and hence, an exhaustive search of the design space for moderate to large problems is implausible. Updating formulae for C − oo can be used to significantly reduce the computational burden. Martin and Eccleston (1992) developed an updating method for finding optimal designs under a linear model with correlated errors. Their algorithm is extended here, but allowing the more general setting of correlated objective effects, within the framework of a linear mixed model. The interchange of two rows of W o is equivalent to first removing two rows from , followed by adding the two units back, but in reverse order. Martin and Eccleston (1997) suggested using a four-step approach which involves adding or removing one unit to obtain the new C − oo . Chan (1999) presented a two-step approach which she claims to be simpler and easier to implement. We begin by developing a two-step approach, similar to that proposed by Chan (1999), but a four-step approach is also presented as this has proven to be competitive to the two-step approach in terms of computational load. The four-step approach is also more suitable for use in the context of early-stage trials, where the presence of so-called singleton treatments (that is, those treatments which occur on only one plot) can cause computational problems as discussed by Coombes (2002).
Let the current design contain n = r + s units, which are referred to as plots in the following. Consider the retention of r plots by removing s plots, and both W o and P p are partitioned conformably with the removal of s plots as follows W o = W or W os and P p = P p;r r P p;rs P p;sr P p;ss where, for example, W or is the design matrix for the subset of the design corresponding to the r retained plots so has r rows and c w o columns. The matrix T is also partitioned conformably with P p . We note that in many cases X p is null in which case P p = V p −1 and hence T = V p . It follows that the coefficient matrix of the reduced MMEs for the subset of the design corresponding to the r retained plots is where T ∼ r r is any particular generalized inverse of T r r . Using a similar argument to Martin and Eccleston (1992) and results on the inverse of partitioned matrices (see, for example, Searle 1982), it can be shown that where F rs = W or P p;rs + W os P p;ss . Hence, an updating formula for C (r)− oo can now be derived using (8) where H * rs = F * rs P − p;ss P p;ss . If P p;ss is non-singular, then H * rs = F * rs . Furthermore, if ( P p;ss + H * rs C (r)− oo H * rs ) is non-singular, then (12) becomes Using the two updating formulae, one for the removal and one for the addition, allows us to compute the new A-value from (r+s) = DC (r+s)− oo D . Setting s = 1 leads to the four-step updating scheme proposed by Martin and Eccleston (1992). The four-step approach can have computational advantages, and additionally, it can be implemented to handle designs in which all of the objective effects are fixed effects and the design contains singletons. Using the updating formulae in (10) and (13), as an example, when s = 1, we have: for p p;ss − h rs C − oo h rs > 0. The two-step approach is straightforward; however, it is useful to illustrate the four-step approach in more detail. As a simple example of the four-step approach, consider interchange of two units in the design, denoted by (ω a , ω b ). The rows of the objective design matrix W o are indexed using such that W o = W o [ , ], and let −a be the set of plots excluding ω a and −b be the set of plots excluding ω b ; then, the four-step approach consists of the following four steps:  (ω a , ω b ), then the steps must be arranged so that the singleton is not removed first.

Example
The implementation of model-based design for early-stage trials is shown here via an example based on one of the breeding programs introduced in Sect. 2. This example also forms the basis of the simulation study presented in Sect. 4. A design is required for testing 256 S1 stage entries from the field peas program with the aim of making selections for progression to S2 stage testing. The trial will also include four check varieties, which is standard practice in most S1 stage field trials. The predictions of interest are the total genetic effects for each of the 260 entries. In the linear mixed model, the total genetic effects will be partitioned into additive and non-additive effects through the use of the numerator relationship matrix for the 260 entries. Summaries of this matrix included a mean inbreeding coefficient of 0.941, while the minimum, mean and maximum genetic relatednesses were 0.130, 0.346 and 0.532. One of the check varieties had the highest mean genetic relatedness, as it had been widely used as a parent of the test lines.
A p-rep design will be constructed, with 50% of the breeding lines having two plots each. (Note that percentage replication will be examined in Sect. 4.) The check varieties will also each have two plots so the full design involves 392 plots that will be arranged as 12 columns by 28 rows. The linear mixed model for the design search will include terms for non-genetic effects as determined from historic analyses of relevant trials. In the current example, this relates to a comprehensive set of S1 and S2 stage trials from the field peas breeding program for the years 2013 to 2017 inclusively. This led to the inclusion of random effects for a twolevel blocking factor CRep (corresponding to columns 1-6 and 7-12); random Column and Row effects and finally a separable first-order autoregressive process for the residuals. The design search requires values of the associated variance parameters, and these will be taken as the median values from the historic analyses.
The design for the example can be constructed in OD-V2 (Butler and Cullis 2018) using the following call: example.od <-od(fixed=˜1, random=˜ric(Entry, Ainv) + CRep + Column + Row, residual =˜ar1(Column):ar1(Row), permute=˜ric(Entry, Ainv), G.param = sv, R.param = sv, maxit=10, search="tabu", data=data.df) In this call, the data frame data.df has 392 rows and corresponds to an initial (user-supplied) design. The genetic effects are specified in the term ric(Entry, Ainv) and relate to the total genetic effects (u g ). The variance function ric() indicates that the variance model for the effects in the factor argument (that is, Entry) is given by σ 2 a A + σ 2 e I 260 where A is the numerator relationship matrix for the 260 entries. Note that in the form given here, the inverse of the numerator relationship matrix, i.e. A −1 is supplied rather than A as it has been stored in sparse form (see Butler 2013). This specification for the total genetic effects will be termed the direct formulation and is a new feature of OD-V2 (Butler and Cullis 2018) that affords computational savings over the alternative, indirect formulation. The latter involves the inclusion of two terms in the model, associated with the additive effects (u a ) and non-additive effects (u e ), so that the resultant vector of objective effects (β o ) has twice as many elements as the direct formulation. The other terms in the random model formula are associated with random CRep, Column and Row effects, respectively. Lastly, the residual model formula specifies that the variance model for the residuals is one associated with a separable first order autoregressive process. The permute model formula specifies that Entry is the factor to be permuted and the use of the direct formulation means that the A-value will be computed for total genetic effects. The variance parameters are supplied using the G.param and R.param arguments (also see "Appendix A'). Table 5 presents the elapsed time (seconds) for a full tabu search and 1000 interchanges for OD-V2 using the direct formulation for total genetic effects: 1000 interchanges for OD-V2 using the indirect formulation for total genetic effects and 1000 interchanges for OD-V1 using the indirect formulation for total genetic effects. The timings were conducted on eight Table 5. Timings for designs for eight trials (trial numbers correspond to those in Table 2): elapsed time (seconds) for a full tabu loop (number of interchanges given in column 3) and for 1000 interchanges for OD-V2 using the direct formulation for total genetic effects; 1000 interchanges for OD-V2 using the indirect formulation for total genetic effects and 1000 interchanges for OD-V1 using the indirect formulation for total genetic effects  Table 5).

Timings for Updating Schemes
The trials included an S1 and S2 stage trial for each of the four crops. The results illustrate the substantial reduction in computational load by using the updating formula developed in Sect. 3.4 and approximately agree with the total number of FLOPs involved, that is, O(c 3 w o ) without an updating scheme compared with O(c w o ) with the updating scheme.

ACCURACY OF DESIGNS USING GENETIC RELATEDNESS
The performance of model-based designs constructed using genetic relatedness and specific non-genetic models was assessed using an in silico experiment based on the example presented in Sect. 3.5.1. Of particular interest were comparisons with model-based designs using specific non-genetic models but no information on genetic relatedness and with more classical approaches to design which use neither specific non-genetic models nor genetic relatedness. Thus, there were three key treatments under investigation, corresponding to three design functions (DF), labelled as DF++, DF−+ and DF−− where the two final characters indicate the use/nonuse of genetic relatedness and specific non-genetic models in the design construction. Further details are provided later.
The simulation experiment was also designed to cover the range of genetic variance parameters and levels of replication in current use for S1 and S2 stage trials in the pulse breeding programs described in Sect. 2. This led to the construction of a total of 63 ancillary treatments to consider, comprising the factorial combinations of the three factors described in Table 6. These factors are labelled Prep (the percentage of breeding lines which have two plots per trial), Padd (the proportion of additive genetic variance expressed as a ratio of the total genetic variance) and Rsq (the baseline reliability of the data for the design with level 0 of the Prep factor). Table 7 presents summaries of the layouts for the seven basic field trial configurations arising from the different replication levels used in the simulation experiment. Every trial Table 6. Summary of the ancillary treatment factors and their levels used in the simulation experiment

Factor
Description Levels

Prep
Percentage of breeding lines replicated p = 0, 5, 10, 15, 25, 50, 100 Padd Proportion of additive genetic variance k = 0.5, 0.7, 0.9 Rsq Baseline reliability r 2 0 = 1/3, 1/2, 2/3 T Prep∧Padd∧Rsq 63 = 7 × 3 × 3 The final column (Tworep) provides the number of entries, including the four check varieties, which had two plots per trial was assumed to be laid out in a contiguous rectangular array with 12 columns and between 22 and 28 rows. The blocking factor CRep was constructed to span two sets of contiguous columns in each case. The linear mixed model chosen to generate the non-genetic effects for the data sets used in this study included terms, variance models and variance parameters as determined from the historic analyses discussed in Sect. 3.5.1. The linear mixed model also partitioned the total genetic effects into additive and non-additive effects. Given the set of variance parameters for the non-genetic effects, the two genetic variance parameters in the data generation model were then chosen to be consistent with the levels of the two ancillary treatment factors Padd and Rsq. Table 8 presents the variance parameters for all of the random and residual variance models in the data generation linear mixed model for each level of Padd and Rsq.
Data sets were generated according to the linear mixed model given by where i = 1, . . . , 63 indexes the ancillary treatments, j = 1, . . . , 3 indexes the design functions and k = 1, . . . , N indexes the simulations (N = 4000). For each k (= 1, . . . , N ), u [k] a,i and u [k] e,i are sampled from a normal distribution with mean zero and variance matrices given by σ 2 a,i A and σ 2 e,i I 260 , respectively. Note that the genetic variance components although indexed by the level of T are constant on the levels of Padd and Rsq, with only nine unique combinations as listed in Table 8. The vector η [k] i , i = 1, . . . , 63; k = 1, . . . , N represents the plot effects that is the sum of all the random non-genetic effects and residuals. These are obtained via sampling from a normal distribution with mean zero and variance where ci and r i are scaled variance matrices of order c i and r i for first-order autoregressive processes where c i and r i are the numbers of columns and rows in the trial layout. These matrices are functions of autocorrelation parameters, ρ c and ρ r , for the column and row dimensions, respectively. The design matrices for the fixed and non-genetic random effects are given by X i , Z cr,i , Z c,i and Z r,i where the latter correspond to the terms in the random model formula, namely CRep, Column and Row, respectively. Values for all the variance parameters are set out in Table 8. There is only one fixed effect associated with an overall mean. Lastly, we consider the design matrix for the genetic effects. A separate design matrix, Z i j , i = 1, . . . , 63; j = 1, 2, 3 is formed for each level of the design function factor as described below • DF++ a design function generated from (16) • DF−+ a design function generated from the plot effects component of (16) but with genetic effects assumed to be fixed effects • DF−− a design function generated from the model of Williams et al. (2011) with genetic effects assumed to be fixed effects Hence, there were a total of 63 different designs for DF++ but only seven designs for DF−+ and DF−−, one for each level of Prep. When forming the design function for the DF++ design, for levels 5, 10, 15, 25, 50 of Prep, we used the method of Huang et al. (2013) to subset those breeding lines to be tested with two plots in the trial. All design functions were obtained using OD-V2 , by varying the model formulae for the fixed, random and residual components. The OD-V2 call for DF++ designs was as given in Sect. 3.5.1. Details and scripts for all designs are available in the supplementary materials.
The correlations between the true and predicted total genetic effects for all entries were calculated for each ancillary treatment and design function, (a total of 189 combinations), where the predicted genetic effects were the empirical BLUPs obtained from the fitting of the linear mixed model that corresponded to the data generation model.

SIMULATION STUDY RESULTS
The estimation of the variance parameters is considered first. Summaries of the MSEP, simulation variance and bias (expressed as a % of the true value) for all variance parameters in the model are available in the supplementary materials. Only the results for the additive and total genetic variance parameters are presented here. Figures 2 and 3 present plots of the percentage bias for the additive genetic variance and total genetic variance against the levels of Prep. Significant bias occurs for some combinations of k and r 2 0 . There is consistent, negative bias for k = 0.9 for the additive genetic variance, for all levels of Prep and Rsq. This negative bias is associated with a significant positive bias for the non-additive genetic variance for these combinations, which then results in bias for the total genetic variance, especially for k = 0.9 and r 2 0 = 1/3. Bias for the total genetic variance is acceptable for the four Padd and Rsq combinations of (k = 0.5, r 2 0 = 1/2), (k = 0.5, r 2 0 = 2/3), (k = 0.7, r 2 0 = 1/2) and (k = 0.7, r 2 0 = 2/3) for all levels of Prep. The remaining combinations of Padd and Rsq all exhibit unacceptable bias for total genetic variance, particularly for small values of Prep. Bias for the additive genetic variance is acceptable for all levels of Prep except when k = 0.9. This latter bias is largely a result of the significant positive bias in estimation of the non-additive genetic variance, when it is such a small component of the total genetic variance and is so close to zero (relative to the residual variance). Table 9 presents the mean squared error of prediction (MSEP-CV) for the REML estimate of the additive genetic variance expressed as a coefficient of variation (relative to the true value) for each level of replication of the test lines, the proportion of additive genetic variance as a ratio to the total genetic variance and for baseline heritabilities of r 2 0 = 1/3 and r 2 0 = 1/2. The results for r 2 0 = 2/3 are similar and are omitted. The MSEP-CV for the DF++ design function is consistently lower than the MSEP-CV for the other two design functions. Of particular interest is the ability of the DF++ design function to achieve comparable levels of      The acronyms DF++, DF−+ and DF−− refer to a design function generated from (16), to a design function generated from the plot effects component of (16) but with genetic effects assumed to be fixed effects and to a design function generated from the model of Williams et al. (2011) with genetic effects assumed to be fixed effects, respectively It is immediately clear that there is a consistent and important increase in correlation for DF++ designs relative to both DF−+ and DF−− designs. The advantage is larger for higher values of k and lower values of r 2 0 . The DF++ designs are superior for all levels of Prep, though the advantage diminishes for designs with Prep less than 10%. It is also noteworthy that the advantage of DF++ designs for values of Prep greater than 25% is stable, suggesting there is little impact of using the subset selection method of Huang et al. (2013) in terms of improving the accuracy of the design.
The plots also highlight the interaction between the increase in simulation-based correlation and model-based reliability due to increasing the level of replication and the value of k. This suggests that the information coming from the data plays an important role in improving the accuracy of the prediction of total genetic effects when the non-additive component is present (and can be reliably estimated). The impact of variance parameter estimation on accuracy is clear. All simulation-based correlations are significantly lower than model-based reliabilities, but particularly so for designs with values of Prep less than 10%. This is consistent with the MSEP-CV shown in Table 9.
One final remark is that across all levels of k, r 2 0 and Prep, there is no appreciable difference between design functions DF−+ and DF−−. This may be a result of the choice of variance parameters for the residual process which gave rise to only modest levels of spatial dependence arising from the inclusion of spatial dependence. Chan (1999) suggested that the optimality of a model-based design is robust to the choice of variance parameter settings as long as the model class remains the same which is not entirely consistent with our findings. Further work in this area is required to examine this issue in more detail. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

APPENDIX A: OD TIMING SYNTAX
Below are the calls used to obtain the timings in Table 5. The first sequence of calls provides an estimate of the elapsed time to complete a full tabu search of the design space and 1000 interchanges using OD-V2 and the direct formulation for the total genetic effects. The function od.init sets or displays various options that affect the behaviour of OD-V2. The first call to OD returns a template from which variance parameters for each of the random and residual terms can be specified. These are assigned in the subsequent two lines of the code, and these are supplied in the call to OD-V2 via the R.param and G.param arguments. The permute argument specifies that the column labelled Entry, which contains the entry code, is to be permuted and the optimality criterion is for the total genetic effects for those entries with data. Use of the time argument provides an estimate of the time for a single tabu loop based on a set of random interchanges. If time> 0, the objective criterion is evaluated the specified number of times prior to the optimization step. The elapsed time is reported and used to estimate the execution time for a scavenging tabu loop.
sv <-od(fixed=˜1, random=˜ric(Entry, Ainv) + RRep + Column, residual =˜ar1(Column):ar1(Row), permute=˜ric(Entry, Ainv), start.values = TRUE, data=dat) sv <-sv$vparameters.table sv$Value <-c(sigma.vmg,sigma.ideg, sigma.rrep, sigma.col, sigma,colphi,rowphi) od.options(time=1e3) temp.od <-od(fixed=˜1, random=˜ric(Entry, Ainv) + RRep + Column, residual =˜ar1(Column):ar1(Row), permute=˜ric(Entry, Ainv), R.param = sv, G.param = sv, search = 'tabu',maxit=1,data=dat) The next sequence of calls estimates the elapsed time for 1000 random interchanges using the indirect model, but in this call to OD-V2 the optimality criterion is based on the additive genetic effects. OD-V2 does not allow the user to specify the calculation of the optimality criterion for the total genetic effects using the indirect formulation for total effects. The pipe operator (|) is used to associate the permute factor with other terms in the model that should be permuted in parallel. The performance of the indirect formulation for the total genetic effects was found to be substantially inferior to the direct formulation, and hence it has been deprecated. Therefore, the elapsed times using these calls will not be exactly equivalent to the use of the indirect formulation within the updating scheme, but trials suggest that they are adequate to ascertain the approximate penalty associated with use of the indirect formulation, which is the only form available in OD-V1 for total genetic effects.