Introduction

In plant genetics, linkage analysis has been commonly employed to detect quantitative trait loci (QTL). Such approaches have a high power to detect QTL in genome-wide scans (Yu and Buckler 2006). Recently, several attempts have been made for detecting QTL using association mapping methods (e.g., Kraakman et al. 2004; Thornsberry et al. 2001; Vuylsteke et al. 2000) which have the merit of providing a high resolution for QTL detection (Remington et al. 2001). To combine the advantages of both mapping methods, Yu et al. (2008) proposed a nested association mapping (NAM) strategy which uses genome sequence information of recombinant inbred line (RIL) populations derived from several crosses of parental inbreds.

The NAM strategy is based on the idea that the genomes of RILs are mosaics of chromosomal segments of the parental genotypes. Consequently, within chromosomal segments the linkage disequilibrium (LD) information across the parental inbreds is maintained. Thus, if diverse parental inbreds are used, LD decays within the chromosomal segments over a short physical distance (Wilson et al. 2004). Therefore, the QTL mapping strategy using linkage and LD information is expected to have not only a high power to detect QTL in genome-wide approaches but also a high mapping resolution when both linkage and LD information are used.

Exploitation of the advantages of the NAM strategy requires developing, genotyping, and phenotyping of RIL populations from several crosses of diverse parental inbreds. However, this necessitates large financial resources (cf., Yu et al. 2008). Therefore, it is mandatory that the available resources are spent in an optimum way.

Stich et al. (2007) compared RIL populations derived from various mating designs with respect to their power for detecting three-way epistatic interactions in maize. The power 1 − β* is expected to be influenced not only by the mating design from which RIL populations were derived, but also by the number of RILs as well as the heritability on an entry mean basis. To our knowledge, in the context of QTL detection, no study has so far examined the optimum allocation of resources with respect to the number of RILs as well as the number of environments and replications per environment used for phenotypic evaluation.

In the present study, we used computer simulations to optimize for three different fixed budgets the power of QTL detection 1 − β* of RIL populations derived from a nested design by varying (1) the genetic complexity of the trait, (2) the costs for developing, genotyping, and phenotyping RILs, (3) the total number of RILs, and (4) the number of environments and replications per environment used for the phenotypic evaluation of RILs.

Materials and methods

Simulations

Our computer simulations were based on empirical data of 653 single nucleotide polymorphism (SNP) markers of 26 diverse maize inbred lines, namely B73, B97, CML52, CML69, CML103, CML228, CML247, CML277, CML322, CML333, Hp301, IL14H, Ki3, Ki11, Ky21, M37W, M162W, Mo18W, MS71, NC350, NC358, Oh7b, Oh43, P39, Tx303, and Tzi8. These inbreds were selected on the basis of 100 simple sequence repeat markers out of a worldwide sample of 260 maize inbreds to capture the maximum genetic diversity (Liu et al. 2003).

Mating design evaluated

In our simulations, a population of RILs was derived from each of the N C = 25 crosses between B73 and the 25 diverse inbreds, where the total number of RILs was N. This mating design corresponds to that applied in the project “Molecular and Functional Diversity of the Maize Genome” to establish the NAM population (Yu et al. 2008), for which seeds are now available from the Maize Genetics Cooperation Stock Center (E. Buckler, personal communication). The number of RILs per cross N P was calculated as follows: in scenarios with rN mod N C = 0, N P = N/N C. In contrast, in scenarios with r ≠ 0, we chose for r populations N P N/N C + 1, whereas for the remaining N C − r populations N PN/N C.

An interesting property of the above described mating design is that with common-parent-specific (CPS) markers genotyped for the parental inbreds and the RILs, the inheritance of chromosome segments nested within two adjacent CPS markers can be inferred through linkage. Genotyping the founders with additional high-density markers enables the projection of genetic information, capturing LD information, from the parental inbreds to the RILs. This approach is expected to allow high-resolution QTL mapping with a relatively low number of markers in the RILs. However, this strategy is not straightforward to implement for NAM populations, which were derived from other mating designs than the above-described one. Thus, the projection of genetic information based on CPS markers was neglected in our study to facilitate conclusions which are not restricted to the mating design proposed by Yu et al. (2008).

Economic framework and quantitative–genetic parameters

Our simulations assumed a total budget B for (1) developing, (2) genotyping, and (3) phenotyping RILs: B = N(C dev + C geno + ERC fp) + EC env, where R is the number of replications at each of the E environments. C dev, C geno, and C fp are the costs for (1) developing one RIL, (2) genotyping one RIL, and (3) testing one field plot, respectively, and C env are the fixed costs for conducting a field test in one environment. We examined B = 1.25, 2.5, 5 million $. Furthermore, we restricted our simulations to E = 1, 4, 7,…, 19 and R = 1, 2, 3 because higher values are unrealistic in a plant breeding context (cf., Schön et al. 2004). For the standard scenario of costs, we assumed C dev = 30 $, C fp = 15 $, and C env = 25,000 $ (W. Schipprack, personal communication). In view of the fast progress of genotyping and sequencing techniques (Churchill et al. 2004, Shendure et al. 2004), we assumed in our study that all RILs are genotyped with such a high number of markers that each QTL has a marker which is in complete LD with the QTL. We considered it realistic to assume that in the near future, genotyping with such a high number of markers will be available (C geno) for 1,000 $ (Shendure et al. 2004). Eight additional cost scenarios were considered with C dev, C fp, C env being halfed and doubledas well as C geno being quartered and quadrupled (Table 2). Based on the values assumed for E, R, C dev, C geno, C fp, and C env, the total number of RILs N was calculated.

We assumed the proportions among variance components as σ 2g 2ge 2 e  = 1:0.5:1 (VC2), where σ 2g refers to the genotypic variance, σ 2ge to the variance of genotype × environment interactions, and σ 2e to the error variance. This ratio was chosen based on the analysis of variance of traits of presumably medium genetic complexity such as grain moisture (Melchinger et al. 1998). However, to cover with our simulations a wide range of quantitative traits, two additional scenarios were considered with interaction and error variance being halved (VC1) and doubled (VC3) in comparison with VC2. Based on ER, and VC, heritability on an entry mean basis h 2 was calculated as

$$ h = {\frac{{\sigma_{{\text{g}}}^{2} }}{{\sigma_{{\text{g}}}^{2} + \sigma_{{{\text{ge}}}}^{2}/E + \sigma_{{{\text{ge}}}}^{2}/(ER)}}}.$$

Definition of genotypic and phenotypic values

A total of 100 simulation runs was performed for each examined scenario. For each run, three subsets of SNPs (l = 25, 50, 100) were sampled at random without replacement from the linkage map and were defined as QTL. The SNP markers of our study are bi-allelic and, thus, the 25 diverse parental inbreds carry either the same allele as the reference parent B73 or the non-B73 allele. At each QTL, one allele was assigned the genotypic effect zero, whereas the genotypic effect of the other allele was drawn randomly without replacement from the geometric series l(1 − a)[1, aa 2a 3,..., a l−1], with a = 0.90 (25 QTL), a = 0.96 (50 QTL), or a = 0.99 (100 QTL) (Lande and Thompson 1990). Because not all of the 25 diverse inbreds have the non-B73 allele, not every QTL segregates in every population. Genotypic values of the inbreds were determined by summing up the effects of the individual alleles.

From the genotypic values of the progenies of each cross, the genotypic variance of RILs within the cross σ 2g was calculated. For the progenies of each cross, the phenotypic values of the RILs were generated by adding a realisation from a normally distributed variable \({\mathcal N}\left(0,\left(\frac{1 - h^2}{h^2}{\sigma_{g}^2}\right)\right)\) to the genotypic values. The phenotypic values calculated in this way were used for the QTL detection procedure. All simulations were performed with software PLABSOFT (Maurer et al. 2008).

Statistical analyses

The comparison of statistical analyses concerning the power 1 − β* requires an equal experiment-wise error rate α*. To meet this requirement, we applied the following two step procedure for QTL detection. In a first step, stepwise multiple linear regression implemented in PLABQTL (Utz and Melchinger 1996) was used to select a set of cofactors based on the Schwarz Bayesian Criterion (SBC) (Schwarz 1978). We described earlier that we assumed that all RILs are genotyped with such a high number of markers that each QTL has a marker which is in complete LD with the QTL. Therefore, all SNPs, inclusive those treated as QTL, were included in the QTL detection procedures.

In the second step, we calculated a P value for the association of each marker q with the phenotypic value for an F test with a full model against a reduced model:

$$ y = \mu + b_{\rm q} x_{\rm q} + \sum_{c \neq q} b_{\rm c} x_{\rm c} + e, $$

where y is the vector of the phenotypic values of all RILs, μ the intercept, b q (b c) the regression coefficient of the qth marker locus (or cth cofactor), x q (x c) an incidence vector of the genotypes of the RILs at the qth marker (cth cofactor), and e the vector of residual errors.

In our study, each combination of B, C dev, C geno, C fpC env, l, and VC was designated as scenario. For each combination of E and R within a scenario, the nominal α level was chosen in such a way that the experiment-wise error rate α*, which was determined using the knowledge of where the simulated QTL were, was 0.01. Based on this α level, the power for QTL detection (1 − β*) was calculated as the proportion of QTL correctly identified out of the total number of QTL l, where correctly means that the QTL itself was identified as significant in the final model. Subsequently, we identified for each scenario the number of environments E opt as well as the number of replicates R opt for which the maximum power for QTL detection \( 1 - \beta _{{{\text{opt}}}}^{*} \) was observed. The optimum heritability h 2opt as well as the optimum number of RILs N opt arise from E opt, R opt, and B.

Results

The average map distance between the 653 SNP markers was 2.6 cM. The pairwise genetic dissimilarity among the 26 diverse inbreds ranged from 0.58 to 0.75. The average frequency of B73 alleles in the JLAM population was 0.81.

For the standard scenario of costs, E opt ranged across the various examined levels of B and VC from 7 to 19 in the scenarios with l = 25 QTL and R opt varied between two and three (Fig. 1; Table 1). The E opt and R opt values observed for the scenarios with l = 50 and 100 QTL were slightly higher than those observed for the corresponding scenarios with l = 25 QTL. Across the examined levels of l, the increase in the non-genetic variance from VC1 to VC3 resulted in an increase of E opt and R opt. Increasing the budget B from 1.25 to 5 million $ had no influence on E opt and R opt across the examined levels of l and VC.

Fig. 1
figure 1

Power to detect quantitative trait loci (QTL) (1 − β*) at an experiment-wise error rate of 0.01 assuming B = 2.5 millon US dollar. The costs for (1) establishing one RIL (C dev), (2) genotyping one RIL (C geno), and (3) testing one field plot (C fp), and the fixed costs for each environment (C env) were 30, 1,000, 15, and 25,000 $, respectively. R is the number of replicates at each of the E environments and VC is the ratio of variance components assumed. For a detailed definition of the examined parameters see “Materials and methods

Table 1 Number of environments (E opt) and number of replications per environment (R opt) maximizing the power of QTL detection \( (1 - \beta _{{{\text{opt}}}}^{*}) \) at an experiment-wise error rate of 0.01

For the examined levels of B and VC, an increase of l from 25 to 100 reduced \( 1 - \beta _{{{\text{opt}}}}^{*} \) up to 5% of the initial value (Table 1). The increase in the non-genetic variance from VC1 to VC3 decreased \( 1 - \beta _{{{\text{opt}}}}^{*} \) less pronounced than that observed for the increase of l. Increasing the budget B from 1.25 to 5 million $ resulted in an increase of \( 1 - \beta _{{{\text{opt}}}}^{*} \) for all examined levels of l.

For budgets B of 1.25 and 5 million $, the increase of C dev resulted in an increase of E opt (Table 2). The increase of C geno resulted for all examined budgets B in an increase of E opt and/or R opt. For all examined budgets B, the increase of C fp resulted in a decrease of E opt, where the increase of C env resulted in a decrease of E opt and an increase of R opt.

Table 2 Number of environments (E opt) and number of replications per environment (R opt) maximizing the power of QTL detection \( (1 - \beta _{{{\text{opt}}}}^{*}) \) at an experiment-wise error rate of 0.01

Across all examined levels of B, the increase of C dev from 15 to 60 $ did not alter \( 1 - \beta _{{{\text{opt}}}}^{*} \) (Table 2). In contrast, a medium decrease of \( 1 - \beta _{{{\text{opt}}}}^{*} \) was observed for an increase of C geno from 200 to 5,000 $, whereas the decreases of \( 1 - \beta _{{{\text{opt}}}}^{*} \) were of small size upon alteration of C fp and C env from the low to the high level.

Discussion

Assumptions underlying the simulations

We assumed that all RILs are genotyped with such a high number of markers that each QTL has a marker which is in complete LD with the QTL. Empirical studies based on a NAM population of maize, however,might require a higher number of marker than was used in our study. In this case, it might be necessary to choose a significance threshold that is lower than that of the current study, which in turn reduces 1 − β*. However, this reduction of power estimates is expected to be of similar size for all scenario and, thus, no influence on E opt and R opt is expected.

On the genome-wide level, the most common polymorphisms such as SNPs and insertion/deletions are biallelic (e.g., Cho et al. 1999). Therefore, it is also expected that functional polymorphisms are predominantly biallelic. However, several, tightly linked, biallelic polymorphisms can build up multiple alleles present at QTL (e.g., Harjes et al. 2008). Because such information was not available for the maize inbreds used in our study, we neglected this possibility and assumed the presence of only two alleles at each QTLacross the entire set of maize inbreds. This assumption is expected to increase 1 − β* compared to a scenario with multiple QTL alleles. However, preliminary simulations (data not shown) suggested no or only marginal effects on E opt and R opt.

Power to detect QTL

For scenarios with similar l and h, the power 1 − β* observed in our study was considerably higher than that of Yu et al. (2008). This difference is most likely due to the fact that in our study the phenotypic values were simulated based on population-based heritability estimates. In contrast, in the study of Yu et al. (2008) phenotypic values used for the QTL detection procedure were simulated based on one heritability estimate across all 25 populations. In the latter case, the large difference between the means of the different populations will lead to a low population-based heritability which furthermore reduces 1 − β*. The above-described difference in an assumption leads to the fact that the power estimates of our study and that of Yu et al. (2008) cannot directly be compared although the scenarios seem to be similar with respect to l and h. Furthermore, the different criteria and/or thresholds applied for QTL detection might further contribute to this difference in power estimates.

Mating designs for establishing RIL populations

The results of Stich et al. (2007) revealed considerable differences between RIL populations derived from diallel and different partial diallel designs with respect to the power for detection of three-way epistatic interactions. Due to the high computational burden, however, the simulations of the current study had to be restricted to one mating design. We examined RIL populations derived from a nested design, because this design was used in maize to develop the NAM population (McMullen et al. 2009). Nevertheless, preliminary simulations indicated that the findings of our study are not restricted to RIL populations derived from specific mating designs.

Optimum allocation of resources for NAM

In studies using the latest genomics tools for dissection of quantitative traits, phenotypic information is often generated with low intensity (cf., Aranzana et al. 2005) resulting in low heritabilities on an entry mean basis. However, 1 − β* can be considerably increased by increasing the heritability (cf., Stich et al. 2007). In studies with a fixed budget, however, this implies a reduction in the number of RILs which in turn reduces 1 − β*. Therefore, in studies with a fixed budget, it is an important issue to determine the number of RILs and the corresponding intensity of their phenotypic evaluation for maximizing 1 − β*.

In our simulations, considerable differences were observed between the 1 − β* estimates of experiments with optimally and sub-optimally allocated resources (Fig. 1). This finding is in accordance with results of Knapp and Bridges (1990) who examined based on theoretical considerations the optimum allocation of resources in a linkage mapping context. These results illustrated the potential of improving the QTL detection power without increasing the total resources required for a QTL mapping experiment.

Knapp and Bridges (1990) suggested that in scenarios with residual genetic variance, i.e. not all QTL are detected, 1 − β* is maximized by maximizing the number of genotypes in the segregating population and using one replicate for phenotypic evaluation. This finding, however, is based on the assumption that one replicate is as expensive as one additional genotype. Because this is not true for most experimental situations, we discuss in the following all factors which have the potential to influence the optimum allocation of resources: (1) genetic architecture of the trait, (2) total budget, and (3) costs for establishing, phenotyping, and genotyping RILs.

Genetic architecture of the trait

Our results suggest that in the scenarios with 25 QTL the optimum number of environments E opt as well as the optimum number of replications R opt were slightly lower than those observed for the scenarios with 50 and 100 QTL (Table 1). This finding might be explained by the fact that in the latter scenarios the proportion of the genotypic variance explained by single QTL is considerably lower than in the former scenario. Detecting QTL which explain a low proportion of the genotypic variance, however, requires a higher heritability and consequently more intensive phenotypic evaluation than detecting QTL which explain a high proportion of the genotypic variance.

In addition to the number of QTL l, the genetic architecture of a trait is characterized by the ratio of variance components. Our simulations suggest that across the examined levels of l, E opt and R opt increase with increasing non-genetic variance. This finding might be due to the higher values of E and R, which are required in a scenario with high non-genetic variance in order to obtain a heritability similar to that of a scenario with low non-genetic variance.

In conclusion, our findings suggest that the optimum allocation of resources differs among different traits, which is in accordance with findings of Schön et al. (2004). However, given the high effort to establish NAM populations, the goal is to examine traits of different genetic complexity in the same experiment. Thus, the optimization of the experimental design cannot be performed separately for each trait but must take into account several traits simultaneously.

We detected a high absolute value of the slope of the power curve for values of E < E opt, whereas this value was low for values of E > E opt (Fig. 1). This observation suggested that it is more promising to use a number of environments E which is higher than the optimum than using an E value which is lower than E opt. Furthermore, we observed for traits of low genetic complexity smaller values for E opt and R opt than for traits with high genetic complexity. These findings suggested that it is most promising to optimize the design of NAM populations with respect to the trait with the highest genetic complexity.

Total budget

Our results indicated that in contrast to increasing the genetic complexity of the trait, E opt and/or R opt are hardly affected by changes of the total budget B. This finding might be explained by the fact that even for the low budget B of 1.25 million $ the heritability is not the parameter limiting the power for detection of QTL 1 − β*.

Costs for establishing, genotyping, and phenotyping RILs

Owing to the high computational burden, the influence of altering the costs for establishing, genotyping, and phenotyping RILs was examined only for a trait of medium genetic complexity

The results of our simulations suggested that the increase of C dev resulted in an increase of E opt. Similarly, the increase of C geno resulted for all examined budgets B in an increase of E opt and/or R opt. These observations might be explained by the fact that with the increase of these two costs the price for each RIL increases and, therefore, the phenotyping with a higher intensity becomes advantageous. The opposite explanation is true for the increase of C fp which resulted in a decrease of E opt. The increase of C env which resulted in a decrease of E opt and an increase of R opt might be due to the fact that in this situation the substitution of environments by replications becomes advantageous. Consequently, the results of our simulations suggest that variation of the costs for establishing, genotyping, and phenotyping RILs in a framework, which is realistic in a plant breeding context, has considerable effects on E opt and R opt and, thus, should be considered when planning a QTL mapping experiment.

Comparison with the optimum allocation of resources observed for plant breeding programs

In contrast to the optimum allocation of resources in the context of QTL detection, several studies examined the optimum allocation of resources in plant breeding programs of various crops (e.g., Longin et al. 2006; Tomerius et al. 2008). In such studies, the optimum number of replications R opt was typically one and the optimum number of environments E opt was mostly smaller than 10. The observation that in our study the values for R opt and E opt were substantially higher (Tables 1, 2) might be explained by the fact that for high-resolution QTL mapping, a higher number of marker has to be genotyped. Despite the fact that we assumed genotyping costs C geno which might be realistic in the near future, in our study, between one-third and one-half of the total budget is used for genotyping which in turn makes it advantageous to increase E opt and R opt. This finding indicated that also in studies using the latest genomic tools to dissect quantitative traits, it is required to evaluate the individuals of the mapping population in a high number of environments with a high number of replications per environment.

Conclusions

Our finding of differences in 1 − β* estimates between experiments with optimally and sub-optimally allocated resources illustrated the potential to improve the power for QTL detection without increasing the total resources necessary for a QTL mapping experiment. However, the results of our study suggest that there is no single best allocation for every NAM experiment, but E opt and R opt are strongly influenced by the underlying values of VC, C dev, C geno, C fp, and C env. In contrast, B and l have only marginal effects on the optimum allocation of resources.