Practical Workflow from High-Throughput Genotyping to Genomic Estimated Breeding Values (GEBVs)

Contaldi, Felice; Cappetta, Elisa; Esposito, Salvatore

doi:10.1007/978-1-0716-1201-9_9

Part of the book series: Methods in Molecular Biology (MIMB, volume 2264)

Abstract

The global climate is changing, resulting in significant economic losses worldwide. It is thus necessary to speed up the plant selection process, especially for complex traits such as tolerance to biotic and abiotic stresses. Genomic selection (GS) is opening new ways to boost plant breeding by enabling the rapid selection of superior genotypes based on their genomic estimated breeding values (GEBVs). GEBVs take into account all markers distributed throughout the genome, including those with minor effects. Although the effect of each marker may be very small, the large number of genome-wide markers retrieved by high-throughput genotyping (HTG) systems (mainly genotyping-by-sequencing, GBS) has the potential to explain all the genetic variance for a particular trait under selection. Several workflows for GBS and GS data have been described, but it is still hard for researchers without a bioinformatics background to carry out these analyses. This chapter outlines some of the recently available bioinformatics resources that enable researchers to establish GBS applications for GS analysis in their laboratories. Moreover, we provide useful scripts that can be used for this purpose and describe the key factors that need to be considered in these approaches.
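
To make concrete the idea that a GEBV sums many small marker effects, the following is a minimal sketch using ridge regression, one common GS model. It is not one of the chapter's scripts: the -1/0/1 marker coding, the toy genotypes and phenotypes, the shrinkage value `lam`, and the function names are all illustrative assumptions.

```python
# Ridge-regression sketch: every marker gets a small, shrunken effect,
# and a line's GEBV is the sum of those effects over its genotype.
# All data and names below are hypothetical, for illustration only.

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def marker_effects(genotypes, phenotypes, lam):
    """Estimate u = (Z'Z + lam*I)^-1 Z'(y - mean(y)) with column-centred Z."""
    n, p = len(genotypes), len(genotypes[0])
    col_means = [sum(row[j] for row in genotypes) / n for j in range(p)]
    z = [[row[j] - col_means[j] for j in range(p)] for row in genotypes]
    mu = sum(phenotypes) / n
    yc = [y - mu for y in phenotypes]
    ztz = [[sum(z[i][a] * z[i][b] for i in range(n)) + (lam if a == b else 0.0)
            for b in range(p)] for a in range(p)]
    zty = [sum(z[i][a] * yc[i] for i in range(n)) for a in range(p)]
    return mu, col_means, solve(ztz, zty)

def gebv(genotype, col_means, effects):
    """GEBV of one line: sum of (centred genotype x marker effect)."""
    return sum((g - m) * u for g, m, u in zip(genotype, col_means, effects))

# Toy training population: 6 lines x 4 markers, coded -1 / 0 / 1.
genotypes = [
    [1, 0, -1, 1],
    [1, 1, 0, -1],
    [-1, 0, 1, 1],
    [0, -1, 1, 0],
    [-1, 1, -1, 0],
    [1, -1, 0, -1],
]
phenotypes = [5.2, 6.1, 3.9, 4.0, 4.4, 5.5]

mu, means, effects = marker_effects(genotypes, phenotypes, lam=1.0)

# A genotyped but unphenotyped selection candidate.
candidate = [1, 1, -1, 0]
print("marker effects:", [round(u, 3) for u in effects])
print("candidate GEBV:", round(gebv(candidate, means, effects), 3))
```

In practice the number of markers far exceeds the number of phenotyped lines, which is why the shrinkage term is needed; a real analysis would estimate it from the data rather than fix it by hand as done here.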

Acknowledgment

The authors thank the BRESOV (Breeding for resilient, efficient and sustainable organic vegetable production) and TomGEM (A holistic multi-actor approach toward the design of new tomato varieties and management practices to improve yield and quality in the face of climate change) projects, funded by the European Union Horizon 2020 research and innovation program under grant agreements No. 774244 and No. 679796, respectively. We also thank D’Acunzo D.M. for editing the manuscript.

Author information

Authors and Affiliations

CREA Research Centre for Vegetable and Ornamental Crops, Pontecagnano Faiano, Italy
Felice Contaldi & Salvatore Esposito
Department of Agricultural Sciences, University of Naples Federico II, Portici, Italy
Elisa Cappetta

Editor information

Editors and Affiliations

Council for Agricultural Research and Economics - Research Centre for Vegetable and Ornamental Crops (CREA-OF), Pontecagnano, Salerno, Italy
Pasquale Tripodi

About this protocol

Cite this protocol

Contaldi, F., Cappetta, E., Esposito, S. (2021). Practical Workflow from High-Throughput Genotyping to Genomic Estimated Breeding Values (GEBVs). In: Tripodi, P. (eds) Crop Breeding. Methods in Molecular Biology, vol 2264. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-1201-9_9

DOI: https://doi.org/10.1007/978-1-0716-1201-9_9
Published: 03 December 2020
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-1200-2
Online ISBN: 978-1-0716-1201-9
eBook Packages: Springer Protocols