Novel strategies for genomic prediction of untested single-cross maize hybrids using unbalanced historical data

  • Original Article
  • Theoretical and Applied Genetics

Abstract

Key message

Weighted genomic prediction outperformed unweighted genomic prediction on an unbalanced dataset representative of a commercial breeding program. Moreover, using the two cycles preceding predictions as the training set achieved optimal predictive ability.
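The training-set choice described in the key message can be illustrated with a minimal sketch. It assumes each hybrid is assigned to a single cycle (here, a year of evaluation). The function name `split_forward`, the parameter `n_preceding`, and the toy data are hypothetical and do not come from the article.

```python
from typing import Dict, List, Tuple


def split_forward(
    hybrid_cycle: Dict[str, int], target_cycle: int, n_preceding: int = 2
) -> Tuple[List[str], List[str]]:
    """Return (training, validation) hybrid names for one forward-prediction fold.

    Training hybrids come from the n_preceding cycles immediately before
    target_cycle; validation hybrids are those assigned to target_cycle.
    """
    first = target_cycle - n_preceding
    train = sorted(h for h, c in hybrid_cycle.items() if first <= c < target_cycle)
    valid = sorted(h for h, c in hybrid_cycle.items() if c == target_cycle)
    return train, valid


# Hypothetical toy data: hybrid name -> cycle (year) of evaluation.
cycles = {"H1": 2009, "H2": 2010, "H3": 2011, "H4": 2012, "H5": 2012, "H6": 2013}
train, valid = split_forward(cycles, target_cycle=2013)
print(train)  # ['H3', 'H4', 'H5']
print(valid)  # ['H6']
```

Hybrids from cycles older than the two-cycle window (H1 and H2 in the toy data) are left out of training, which mirrors the finding that a larger training set is not always better.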

Abstract

Predicting the performance of untested single-cross hybrids through genomic prediction (GP) is highly desirable to increase genetic gain. Here, we evaluate the predictive ability (PA) of novel genomic strategies for predicting single-cross maize hybrids using an unbalanced historical dataset from a tropical breeding program. Field data comprised 949 single-cross hybrids evaluated from 2006 to 2013, representing eight breeding cycles. Hybrid genotypes were inferred from the genotypes of their parental inbred lines using single-nucleotide polymorphism markers obtained via genotyping-by-sequencing. GP models were fitted using genomic best linear unbiased prediction via a stage-wise approach, considering two distinct cross-validation schemes. The results highlight the importance of accounting for the uncertainty of the adjusted means at each step of a stage-wise analysis, given the highly unbalanced data structure and the expected heterogeneity of variances across years and locations in a commercial breeding program. Further, increasing the size of the training set was not always advantageous, even within the same breeding program. Using the two cycles preceding predictions as the training set achieved optimal PA for untested single-cross hybrids in a forward prediction scenario. Such predictions could replace the first step of field screening. Finally, in addition to the practical and theoretical results applicable to maize hybrid breeding programs, the stage-wise analysis performed in this study may be applied to unbalanced historical data from any crop.
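Two ideas from the abstract can be illustrated with a short sketch, which is not the authors' implementation. The first function assumes fully homozygous parents coded 0 or 2 (copies of a reference allele) and derives the hybrid's allele count at each marker. The second shows, in deliberately simplified form, how the uncertainty of adjusted means can be carried forward as inverse-variance weights. In the article the weighting takes place inside mixed models, not in a plain weighted average. All function names and numbers are hypothetical.

```python
from typing import List, Sequence


def infer_hybrid_genotype(parent1: Sequence[int], parent2: Sequence[int]) -> List[int]:
    """Hybrid allele counts (0, 1, 2) from two fully homozygous inbred parents.

    Each parent is coded 0 or 2 per marker and passes on one allele,
    so the single-cross genotype is half of each parent's count, summed.
    """
    if len(parent1) != len(parent2):
        raise ValueError("parents must be genotyped at the same markers")
    hybrid = []
    for a, b in zip(parent1, parent2):
        if a not in (0, 2) or b not in (0, 2):
            raise ValueError("sketch assumes homozygous parents coded 0 or 2")
        hybrid.append(a // 2 + b // 2)
    return hybrid


def weighted_mean(means: Sequence[float], std_errors: Sequence[float]) -> float:
    """Inverse-variance weighted average of adjusted means from several trials."""
    weights = [1.0 / se**2 for se in std_errors]
    return sum(w * m for w, m in zip(weights, means)) / sum(weights)


# Hypothetical marker data for two inbred lines at four SNPs.
line_a = [0, 2, 2, 0]
line_b = [2, 2, 0, 0]
print(infer_hybrid_genotype(line_a, line_b))  # [1, 2, 1, 0]

# Hypothetical adjusted means of one hybrid in two trials; the first is more precise.
adjusted_means = [8.0, 10.0]
std_errors = [0.5, 1.0]
print(sum(adjusted_means) / len(adjusted_means))  # 9.0 (unweighted)
print(weighted_mean(adjusted_means, std_errors))  # 8.4 (weighted)
```

The weighted result is pulled toward the more precisely estimated mean, which is the intuition behind weighting when trials differ in precision across years and locations.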

Acknowledgements

This research was supported by FAPEMIG (Fundação de Amparo à Pesquisa de Minas Gerais), CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico), CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior, program PREMIO 2045/2014, Grant 23038.007195/2012-39), and Embrapa (Brazilian Agricultural Research Corporation). K.O.G. Dias received grants from FAPESP (Fundação de Amparo à Pesquisa do Estado de São Paulo, Grants 2016/12977-7 and 2018/00634-3). A.A.F. Garcia holds a productivity scholarship from CNPq. The authors thank Jhonathan Santos, Paul Schmidt and Jens Hartung for their careful reading of, and suggestions on, an early draft of the manuscript.

Author information

Contributions

KOGD: conceptualization, data curation, methodology, formal analysis, writing of the original draft. HPP: conceptualization, methodology, formal analysis, revision and editing. LJMG: conceptualization, funding acquisition, resources, data curation, revision and editing. PEOG: funding acquisition, resources, data curation. SNP: funding acquisition, resources, data curation. MOP: funding acquisition, resources, data curation. RWN: resources, data curation. JVM: funding acquisition, resources, data curation, revision and editing. CTG: funding acquisition, resources, data curation, revision and editing. AAFG: supervision, conceptualization, resources, revision and editing. MMP: supervision, conceptualization, funding acquisition, resources, data curation, revision and editing.

Corresponding authors

Correspondence to A. A. F. Garcia or M. M. Pastina.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Communicated by Matthias Frisch.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 7676 KB)

About this article

Cite this article

Dias, K.O.G., Piepho, H.P., Guimarães, L.J.M. et al. Novel strategies for genomic prediction of untested single-cross maize hybrids using unbalanced historical data. Theor Appl Genet 133, 443–455 (2020). https://doi.org/10.1007/s00122-019-03475-1

  • DOI: https://doi.org/10.1007/s00122-019-03475-1
