Abstract
We tested the ability of the self-organizing map (SOM), a type of artificial neural network, to reveal genetic patterns within the autotetraploid potato (Solanum tuberosum L.). A total of 591 potato varieties, originating from various major European breeder collections and released into different market segments between 1815 and 2015, were examined using a set of 21 informative microsatellite markers. The consistency of this artificial intelligence approach in detecting genetic stratification in such a homogeneous population was evaluated by comparison with three other multivariate methods that are widely used for this purpose. Results showed that the SOM was equally suitable for classifying varieties into the main detected groups and for visualizing inter-group genetic dissimilarities. When it came to revealing the organization of the population structure at the intra-group level, the traditional multivariate methods lost resolution. By contrast, the SOM provided additional information on intra-group diversity by highlighting a multitude of consistent subgroups, which seemed to be mainly related to common heritage, spatio-temporal features and certain agronomic traits. Relations between the computed SOM subgroups and the market segments were partly elucidated. The relevance of using more flexible multivariate statistical approaches for mapping population structures of crop species is considered throughout this paper in terms of current and future prospects for breeding programs.
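To make the SOM idea concrete, the following is a minimal sketch of how a small map could be trained on allele presence/absence profiles. It is not the authors' implementation: the data are synthetic, and the function names (`train_som`, `map_samples`), the 4 x 4 grid, the learning-rate and neighbourhood schedules and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np

def train_som(data, rows=4, cols=4, n_iter=500, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM with online updates.

    Returns the unit weight vectors, shape (rows * cols, n_features).
    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    n_samples = data.shape[0]
    # Grid coordinates of each map unit, used by the neighbourhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise unit weights from randomly chosen samples.
    weights = data[rng.integers(0, n_samples, rows * cols)].astype(float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                      # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.5 * frac   # neighbourhood radius shrinks to 0.5
        x = data[rng.integers(0, n_samples)]         # pick one sample at random
        # Best-matching unit: the unit whose weights are closest to the sample.
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
        # Gaussian neighbourhood around the best-matching unit on the grid.
        d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
        h = np.exp(-d2 / (2.0 * sigma ** 2))
        # Pull the winner and its grid neighbours towards the sample.
        weights += lr * h[:, None] * (x - weights)
    return weights

def map_samples(data, weights):
    """Return, for each sample, the index of its best-matching unit."""
    d = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

# Synthetic stand-in for marker data: 60 "varieties" x 20 binary allele columns,
# drawn from two hypothetical groups with contrasting allele frequencies.
rng = np.random.default_rng(1)
p_a = np.r_[np.full(10, 0.9), np.full(10, 0.1)]
p_b = p_a[::-1]
profiles = np.vstack([rng.random((30, 20)) < p_a,
                      rng.random((30, 20)) < p_b]).astype(float)

weights = train_som(profiles)
units = map_samples(profiles, weights)
print("units occupied by group A:", sorted(set(units[:30].tolist())))
print("units occupied by group B:", sorted(set(units[30:].tolist())))
```

In a sketch like this, samples that share many alleles tend to land on the same or neighbouring units, which is the property that lets a trained map be read for subgroups within a main group.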
Acknowledgments
This research was fully supported by the Haute Ecole Provinciale de Hainaut-CONDORCET, 17 Chemin du Champ de Mars, 7000 Mons, Wallonia, Belgium and by the non-profit organization of agricultural services of the Province of Hainaut (CARAH), 11 rue Paul Pastur, 7800 Ath, Wallonia, Belgium. The authors gratefully acknowledge Maarten Vossen and Sjefke Allefs (Agrico Research BV), Maurice Schehr (HZPC Holland BV), Guus Heselmans (C. Meijer BV), Peter Oldenkamp (KWS Potato BV), Eric Bonnel and Gisèle Joly (Germicopa SAS), Hervé van den Wyngaert (Binst Breeding Selection), Jens Kr. Ege Olsen (LKF Vandel, Danespo A/S), Sylvie Marhadour (FN3PT), Curzio Caravati (Kenosha Potato Project) and Olivier Mahieu (CARAH asbl) for providing the plant material and associated information used in the present study. The authors also thank Jeromy Hrabovecky and the anonymous reviewer for their help in revising the manuscript.
Author information
Contributions
MS and TM conceived the experiments. MS and JR performed the plant material sampling. MS, MM, JR, AN and CD performed the DNA extractions. MS and MM carried out the genotyping analysis. MS, TM, AN and MB analyzed and interpreted the results. DL supervised the experiments and revised the draft manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
Plant material and associated data were collected with the best intentions by the donors and the authors, respectively. Although no abnormality was detected, the authors are not responsible for any potential inaccuracy in the results except for those arising from genotyping analyses. Legal issues or disputes raised by owners of germplasm cannot be based on data or conclusions disclosed in this publication. The authors declare that they have no conflicts of interest.
About this article
Cite this article
Spanoghe, M.C., Marique, T., Rivière, J. et al. Genetic patterns recognition in crop species using self-organizing map: the example of the highly heterozygous autotetraploid potato (Solanum tuberosum L.). Genet Resour Crop Evol 67, 947–966 (2020). https://doi.org/10.1007/s10722-020-00894-8
DOI: https://doi.org/10.1007/s10722-020-00894-8