Practical Workflow from High-Throughput Genotyping to Genomic Estimated Breeding Values (GEBVs)

  • Protocol
Crop Breeding

Part of the book series: Methods in Molecular Biology (MIMB, volume 2264)

Abstract

The global climate is changing, resulting in significant economic losses worldwide. It is thus necessary to speed up the plant selection process, especially for complex traits such as tolerance to biotic and abiotic stresses. Genomic selection (GS) is opening new ways to accelerate plant breeding by enabling the rapid selection of superior genotypes based on their genomic estimated breeding values (GEBVs). GEBVs take into account all markers distributed throughout the genome, including those with minor effects. Although the effect of each marker may be very small, the large number of genome-wide markers retrieved by high-throughput genotyping (HTG) systems (mainly genotyping-by-sequencing, GBS) has the potential to explain all the genetic variance for a trait under selection. Several workflows for GBS and GS data have been described, but these analyses remain difficult for researchers without a bioinformatics background. This chapter outlines some of the recently available bioinformatics resources that enable researchers to set up GBS applications for GS analysis in their laboratories. We also provide scripts that can be used for this purpose and describe key factors that need to be considered in these approaches.
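To illustrate how a GEBV combines many small marker effects, the following is a minimal sketch of ridge-regression marker-effect estimation, one common GS model. It is not the chapter's own script: the function names (`solve`, `ridge_marker_effects`, `gebv`), the simulated genotypes and phenotypes, and the shrinkage parameter `lam = 5.0` are all assumptions made for illustration. Genotypes are coded as allele dosages (0, 1, 2).

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ridge_marker_effects(genotypes, phenotypes, lam):
    """Estimate one effect per marker: u = (Z'Z + lam*I)^-1 Z'(y - mean(y))."""
    n, p = len(genotypes), len(genotypes[0])
    col_means = [sum(row[j] for row in genotypes) / n for j in range(p)]
    # Center each marker column so the effects are deviations from the mean.
    z = [[row[j] - col_means[j] for j in range(p)] for row in genotypes]
    mu = sum(phenotypes) / n
    yc = [y - mu for y in phenotypes]
    ztz = [[sum(z[i][j] * z[i][k] for i in range(n)) + (lam if j == k else 0.0)
            for k in range(p)] for j in range(p)]
    zty = [sum(z[i][j] * yc[i] for i in range(n)) for j in range(p)]
    return mu, col_means, solve(ztz, zty)

def gebv(genotype, col_means, effects):
    """GEBV = sum over all markers of (centered dosage * estimated effect)."""
    return sum((g - m) * u for g, m, u in zip(genotype, col_means, effects))

# Simulated data (purely illustrative, not from the chapter).
random.seed(1)
n_lines, n_markers = 40, 15
true_effects = [random.gauss(0, 0.3) for _ in range(n_markers)]
geno = [[random.choice([0, 1, 2]) for _ in range(n_markers)]
        for _ in range(n_lines)]
pheno = [sum(g * a for g, a in zip(row, true_effects)) + random.gauss(0, 0.5)
         for row in geno]

# Train on phenotyped lines, then rank unphenotyped candidates by GEBV.
mu, means, effects = ridge_marker_effects(geno[:30], pheno[:30], lam=5.0)
candidates = geno[30:]
ranked = sorted(range(len(candidates)),
                key=lambda i: gebv(candidates[i], means, effects),
                reverse=True)
print("Candidates ranked by GEBV (best first):", ranked)
```

In practice the shrinkage parameter would be estimated from the data (for example from variance components) rather than fixed by hand, and a dedicated mixed-model package would replace the hand-written solver.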

Acknowledgment

The authors thank the BRESOV (Breeding for resilient, efficient and sustainable organic vegetable production) and TomGEM (A holistic multi-actor approach toward the design of new tomato varieties and management practices to improve yield and quality in the face of climate change) projects, funded by the European Union Horizon 2020 research and innovation program under grant agreements No. 774244 and No. 679796, respectively. We also thank D’Acunzo D.M. for editing the manuscript.

Copyright information

© 2021 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Contaldi, F., Cappetta, E., Esposito, S. (2021). Practical Workflow from High-Throughput Genotyping to Genomic Estimated Breeding Values (GEBVs). In: Tripodi, P. (eds) Crop Breeding. Methods in Molecular Biology, vol 2264. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-1201-9_9

  • DOI: https://doi.org/10.1007/978-1-0716-1201-9_9

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-1200-2

  • Online ISBN: 978-1-0716-1201-9

  • eBook Packages: Springer Protocols
