Abstract
Key message
The article presents an optimization of the key parameters for identifying SNPs in sugarcane using a GBS protocol on two Illumina platforms, NextSeq and NovaSeq.
Abstract
Sugarcane (Saccharum sp.), a feedstock used worldwide for sugar, bioethanol, and energy production, has an extremely complex genome that is highly polyploid and aneuploid. A double-digest restriction site-associated DNA sequencing (ddRADseq) protocol was tested in four commercial sugarcane hybrids and one high-fibre biotype for the detection of single nucleotide polymorphisms (SNPs). In this work we compared two Illumina sequencing platforms, two read lengths (70 vs. 150 bp), two levels of sequencing coverage per individual (medium and high), and single-end versus paired-end reads. We also explored different variant calling strategies (with and without a reference genome) and filtering schemes [combining two minor allele frequencies (MAFs) with three depth-of-coverage thresholds]. For the discovery of a large number of novel SNPs in sugarcane, we recommend longer, paired-end reads, medium sequencing coverage per individual, and the Illumina NovaSeq 6000 platform as a cost-effective approach, together with filtering at a lower MAF and higher depth-of-coverage thresholds. Although the de novo analysis retrieved more SNPs, the reference-based method allows downstream characterization of variants. For the two best-performing matrices, the number of SNPs per chromosome correlated positively with chromosome length, demonstrating the presence of variants throughout the genome. Multivariate comparisons with both matrices showed closer relationships among the commercial hybrids than between them and the high-fibre biotype. Functional analysis of the SNPs showed that more than half of them fell within regulatory regions, whereas the remainder affected coding, intergenic and intronic regions. Allelic distance values were lower than 0.07 between two replicated genotypes, confirming the robustness of the protocol.
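The filtering schemes mentioned above (two MAF values combined with three depth-of-coverage thresholds) can be pictured as a small grid of filters applied to the same SNP set. The Python sketch below is only an illustration of that idea, not the authors' pipeline: the threshold values, the toy SNP records, and the function names are all hypothetical, since the abstract does not state them.

```python
from itertools import product

# Toy SNP records: (snp_id, alternate_allele_frequency, mean_depth).
# All values are invented for illustration only.
SNPS = [
    ("snp1", 0.02, 12.0),
    ("snp2", 0.08, 25.0),
    ("snp3", 0.30, 8.0),
    ("snp4", 0.50, 40.0),
    ("snp5", 0.97, 30.0),
]

# Hypothetical thresholds: two MAF cut-offs and three depth cut-offs.
MAF_THRESHOLDS = (0.01, 0.05)
DEPTH_THRESHOLDS = (5, 10, 20)


def minor_allele_frequency(alt_freq):
    """Return the frequency of the rarer of the two alleles."""
    return min(alt_freq, 1.0 - alt_freq)


def filter_snps(snps, min_maf, min_depth):
    """Keep the IDs of SNPs that pass both the MAF and the depth threshold."""
    return [
        snp_id
        for snp_id, alt_freq, depth in snps
        if minor_allele_frequency(alt_freq) >= min_maf and depth >= min_depth
    ]


def filter_grid(snps, maf_thresholds, depth_thresholds):
    """Apply every (MAF, depth) combination and collect the surviving SNP IDs."""
    return {
        (maf, depth): filter_snps(snps, maf, depth)
        for maf, depth in product(maf_thresholds, depth_thresholds)
    }


grid = filter_grid(SNPS, MAF_THRESHOLDS, DEPTH_THRESHOLDS)
for (maf, depth), kept in sorted(grid.items()):
    print(f"MAF >= {maf}, depth >= {depth}: {len(kept)} SNPs kept")
```

With two MAF values and three depth thresholds the grid yields six filtered matrices, which makes it easy to compare how many SNPs survive each combination before choosing one for downstream analysis.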
Data availability
The datasets generated and analysed during the current study are not publicly available because they are part of CM's doctoral thesis, but they are available from the corresponding author on reasonable request.
Acknowledgements
We especially thank Sergio Gonzalez (Instituto de Agrobiotecnología y Biología Molecular, Instituto Nacional de Tecnología Agropecuaria, INTA-Consejo Nacional de Investigaciones Científicas y Técnicas, CONICET, Argentina) for his assistance and advice on bioinformatics, and Giusi Zaina (Department of Agricultural, Food, Environmental and Animal Sciences, University of Udine, Italy) for improving the quality of the manuscript through her critical reading. We are also grateful to Agencia Nacional de Promoción Científica y Técnica and CONICET for supporting the doctoral fellowship of Catalina Molina, and to the Unidad de Genómica, INTA, Argentina, for laboratory collaboration. Last but not least, we thank Luis Erazzú and José M. García for providing the sugarcane hybrids, and the field team of INTA's sugarcane breeding program for its technical assistance.
Funding
The study was funded by grants from Agencia Nacional de Promoción Científica y Técnica (PICT 2016 N° 1670) and INTA (PE N° I114 and PE N° 516).
Author information
Contributions
AA, NBP, SNMP and AFP contributed to the study conception and design. Harvesting and maintenance of plant material and DNA extraction were performed by CM. ddRADseq libraries were prepared by CM, PAV and AFP. Sequencing data were analysed by CM, CVF and NCA. Funding was acquired by AA, NBP and SNMP. The project was administered by AA and supervised by AA, NBP, SNMP, AFP, CVF and NCA. The original draft was written by CM and AA, and reviewed and edited by CM, AA, NBP, SNMP, CVF and NCA. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
All authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Molina, C., Aguirre, N.C., Vera, P.A. et al. ddRADseq-mediated detection of genetic variants in sugarcane. Plant Mol Biol 111, 205–219 (2023). https://doi.org/10.1007/s11103-022-01322-4
DOI: https://doi.org/10.1007/s11103-022-01322-4