215:18

Increasing accuracy and reducing costs of genomic prediction by marker selection

  • Massaine Bandeira e Sousa (Email author)
  • Giovanni Galli
  • Danilo Hottis Lyra
  • Ítalo Stefanini Correia Granato
  • Filipe Inácio Matias
  • Filipe Couto Alves
  • Roberto Fritsche-Neto


Genotyping costs can be reduced without decreasing genomic selection accuracy by selecting subsets of markers. We therefore compared two strategies for obtaining marker subsets: the first uses the primary marker effects and the second uses re-estimated marker effects (REE). We evaluated each subset for prediction accuracy, bias, and relative efficiency using a main genotypic effect model (MGE) fitted with a genomic best linear unbiased predictor linear kernel (GB) and with a Gaussian nonlinear kernel (GK). All scenarios (marker subsets × kernel models) were applied to a public rice diversity panel dataset (RICE) and two hybrid maize datasets (HEL and USP). For grain yield and plant height, the highest prediction accuracies were obtained by MGE-GB and MGE-GK when the number of markers was decreased. Overall, marker subsets obtained by the re-estimated effects method showed a higher relative efficiency of genomic selection. We conclude that, starting from a high-density panel, it is possible to select the most informative markers to improve accuracy and to build a low-cost SNP chip for implementing genomic selection in breeding programs. In addition, we recommend REE strategies for finding marker subsets in the training population, which increases the accuracy of genomic selection.
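The ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not give the exact formulas, so the kernel scaling (median squared distance, bandwidth `h`), the ridge penalty `lam`, the halving schedule `step`, and all function names are assumptions chosen for illustration. The `reestimate=True` branch is only one plausible reading of "re-estimated effects", in which marker effects are refitted on the retained markers as the panel shrinks.

```python
import numpy as np

def standardize(X):
    """Center and scale marker columns; drop monomorphic markers."""
    sd = X.std(axis=0)
    keep = sd > 0
    Z = (X[:, keep] - X[:, keep].mean(axis=0)) / sd[keep]
    return Z, np.flatnonzero(keep)

def linear_kernel(Z):
    """Linear (GBLUP-style) kernel: G = ZZ' / p."""
    return Z @ Z.T / Z.shape[1]

def gaussian_kernel(Z, h=1.0):
    """Gaussian kernel exp(-h * d^2 / q); q = median squared distance (assumed scaling)."""
    sq = np.sum(Z ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    q = np.median(d2[np.triu_indices_from(d2, k=1)])
    return np.exp(-h * d2 / q)

def ridge_marker_effects(Z, y, lam=1.0):
    """Ridge marker effects via the dual form: beta = Z'(ZZ' + lam*I)^-1 (y - mean)."""
    n = Z.shape[0]
    alpha = np.linalg.solve(Z @ Z.T + lam * np.eye(n), y - y.mean())
    return Z.T @ alpha

def select_markers(Z, y, k, lam=1.0, reestimate=False, step=0.5):
    """Return indices of k markers ranked by absolute effect.

    reestimate=False: rank once on the full panel.
    reestimate=True: shrink the panel gradually, refitting effects on the
    retained markers each round (assumed reading of the REE idea).
    """
    assert 0 < step < 1, "step must lie strictly between 0 and 1"
    idx = np.arange(Z.shape[1])
    while True:
        beta = ridge_marker_effects(Z[:, idx], y, lam)
        order = np.argsort(-np.abs(beta))
        target = max(k, int(len(idx) * step)) if reestimate else k
        idx = idx[order[:target]]
        if len(idx) <= k:
            return np.sort(idx)
```

A subset returned by `select_markers` could then be passed to `linear_kernel` or `gaussian_kernel` (for example `linear_kernel(Z[:, subset])`) to build the relationship matrix used by a prediction model, and the accuracy compared against the full panel in cross-validation.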


Keywords: SNP array subset, Relative efficiency, Reliability, Model-kernel



We thank Helix Sementes® (São Paulo, Brazil) for the dataset, and the Allogamous Plant Breeding Laboratory for technical and scientific support. Funding was provided by the National Council for Scientific and Technological Development (CNPq).

Supplementary material

10681_2019_2339_MOESM1_ESM.jpg (95 kb)
Supplementary material 1 (JPEG 95 kb)
10681_2019_2339_MOESM2_ESM.docx (54 kb)
Supplementary material 2 (DOCX 53 kb)


Copyright information

© Springer Nature B.V. 2019

Authors and Affiliations

  • Massaine Bandeira e Sousa (1) (Email author)
  • Giovanni Galli (1)
  • Danilo Hottis Lyra (1)
  • Ítalo Stefanini Correia Granato (1)
  • Filipe Inácio Matias (1)
  • Filipe Couto Alves (1)
  • Roberto Fritsche-Neto (1)

  1. Department of Genetics, Luiz de Queiroz College of Agriculture, University of São Paulo, Piracicaba, Brazil