Abstract
Key message
New software was developed to make tetraploid genotype calls from SNP array data; it uses hierarchical clustering and multiple F1 populations to calibrate the relationship between signal intensity and allele dosage.
Abstract
SNP arrays are transforming breeding and genetics research for autotetraploids. To fully utilize these arrays, the relationship between signal intensity and allele dosage must be calibrated for each marker. We developed an improved computational method to automate this process, which is provided as the R package ClusterCall. In the training phase of the algorithm, hierarchical clustering within an F1 population is used to group samples with similar intensity values, and allele dosages are assigned to clusters based on expected segregation ratios. In the prediction phase, multiple F1 populations and the prediction set are clustered together, and the genotype assigned to each cluster is the mode of the training set samples in that cluster. A concordance metric, defined as the proportion of training set samples whose genotype equals the mode of their cluster, can be used to eliminate unreliable markers and to compare different algorithms. Across three potato families genotyped with an 8K SNP array, ClusterCall scored 5729 markers with at least 0.95 concordance (94.6% of its total), compared to 5325 with the software fitTetra (82.5% of its total). The three families were used to predict genotypes for 5218 SNPs in the SolCAP diversity panel, compared with 3521 SNPs in a previous study in which genotypes were called manually. One of the additional markers produced a significant association for vine maturity near a well-known causal locus on chromosome 5. In conclusion, when multiple F1 populations are available, ClusterCall is an efficient method for accurate autotetraploid genotype calling that enables the use of SNP data for research and plant breeding.
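The prediction phase and the concordance metric described above can be illustrated with a minimal Python sketch. It is not the ClusterCall implementation or its API. It assumes that training samples already carry dosages from the training phase, that each sample is summarized by a single intensity value, and that single-linkage clustering with a fixed number of clusters is used; the abstract specifies none of these details. All function names, the intensity values, and the choice of five clusters are hypothetical.

```python
from collections import Counter, defaultdict


def cluster_1d(values, k):
    """Single-linkage agglomerative clustering of 1-D values into k clusters.

    For 1-D data this is equivalent to cutting the sorted values at the
    k-1 largest gaps. The linkage choice is an assumption of this sketch.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    gaps = [(values[order[j + 1]] - values[order[j]], j)
            for j in range(len(order) - 1)]
    cut_after = {j for _, j in sorted(gaps, reverse=True)[:k - 1]}
    labels = [0] * len(values)
    cluster = 0
    for pos, i in enumerate(order):
        labels[i] = cluster
        if pos in cut_after:
            cluster += 1
    return labels


def call_by_mode(labels, training_dosage):
    """Assign each cluster the modal dosage of its training samples.

    training_dosage maps sample index -> dosage from the training phase.
    Returns per-sample calls (None if a cluster holds no training sample)
    and the concordance: the proportion of training samples whose dosage
    equals the mode of their cluster.
    """
    by_cluster = defaultdict(list)
    for i, dosage in training_dosage.items():
        by_cluster[labels[i]].append(dosage)
    modes = {c: Counter(ds).most_common(1)[0][0]
             for c, ds in by_cluster.items()}
    calls = [modes.get(c) for c in labels]
    agree = sum(1 for i, d in training_dosage.items()
                if modes[labels[i]] == d)
    return calls, agree / len(training_dosage)


# Hypothetical intensity values for one marker: samples 0-7 are training
# samples from F1 populations, samples 8-9 form the prediction set.
intensity = [0.02, 0.05, 0.27, 0.30, 0.26, 0.52, 0.49, 0.75, 0.98, 0.29]
training_dosage = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2, 6: 2, 7: 3}

labels = cluster_1d(intensity, k=5)  # up to five dosage classes (0-4)
calls, concordance = call_by_mode(labels, training_dosage)
print(calls)
print(concordance)
```

In this toy marker, training sample 4 falls in a cluster whose mode differs from its own dosage, so it lowers the concordance; sample 8 lands in a cluster with no training samples and receives no call. A marker-level filter such as the 0.95 concordance threshold mentioned in the abstract would then be applied to the returned value.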
Acknowledgements
Financial support was provided by the National Institute of Food and Agriculture, U.S. Department of Agriculture, Award Number 2014-67013-22418 and Hatch Project Number 1002731.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by Christine A. Hackett.
Cite this article
Schmitz Carley, C.A., Coombs, J.J., Douches, D.S. et al. Automated tetraploid genotype calling by hierarchical clustering. Theor Appl Genet 130, 717–726 (2017). https://doi.org/10.1007/s00122-016-2845-5
DOI: https://doi.org/10.1007/s00122-016-2845-5