Population structure, genetic diversity and linkage disequilibrium in a macadamia breeding population using SNP and silicoDArT markers

Abstract

Macadamia (Macadamia integrifolia Maiden & Betche, Macadamia tetraphylla L.A.S. Johnson and their hybrids) is grown commercially around the world for its high-quality edible kernel. Traditional breeding efforts involve crossing varieties to produce thousands of progeny seedlings for evaluation. Selection on component traits and the use of genomics are options for improving nut yield in macadamia cultivars, but both require accurate knowledge of the genetic diversity and structure of the breeding population. This study reports allelic diversity within and between families of 295 seedling offspring from 29 parents, the population structure, and the extent of linkage disequilibrium (LD) in the population. Genotyping generated 19,527 silicoDArT and 5,329 SNP markers, and, after filtering, 16,171 silicoDArTs and 4,113 SNPs were used for diversity analyses. LD decay was initially rapid at short distances, but low-level LD persisted over long distances, with an average r² = 0.124 for SNPs within 1 kb of each other. The seedling population was relatively genetically diverse, and its diversity was very similar to that of the 29 parents. The diversity among these individuals (H_E = 0.255 for progeny and 0.250 for parents) indicates the level of diversity in the wider population of the breeding programme, though the population appears less diverse than those of other fruit crops. Macadamia progeny were moderately differentiated (F_ST = 0.401) and formed k = 3 distinct clusters, representing M. integrifolia germplasm separating from two different hybrid groups. There was little to no relationship between heterozygosity and performance for nut yield among progeny. These findings will inform future genomic studies of the Australian macadamia breeding programme, such as genome-wide association studies and genomic selection, where knowledge and control of population structure are vital.
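As a rough illustration of two statistics named in the abstract, the sketch below computes expected heterozygosity at one biallelic locus and a genotype-based r² between two loci. It is a minimal sketch under stated assumptions, not the authors' pipeline: genotypes are assumed to be coded 0/1/2 as counts of one allele, the function names are hypothetical, expected heterozygosity is the uncorrected 2pq form, and r² is taken as the squared correlation of allele counts, which is one common estimator for unphased data.

```python
from typing import Sequence


def expected_heterozygosity(genotypes: Sequence[int]) -> float:
    """Uncorrected H_E = 2pq at one biallelic locus.

    Assumes genotypes are coded 0, 1, 2 (copies of one allele) with no
    missing values; this coding is an assumption of the sketch.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = sum(genotypes) / (2 * n)  # frequency of the counted allele
    return 2 * p * (1 - p)


def ld_r2(locus_a: Sequence[int], locus_b: Sequence[int]) -> float:
    """Squared Pearson correlation of allele counts at two loci."""
    if len(locus_a) != len(locus_b) or len(locus_a) == 0:
        raise ValueError("loci must have the same non-zero sample size")
    n = len(locus_a)
    mean_a = sum(locus_a) / n
    mean_b = sum(locus_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(locus_a, locus_b)) / n
    var_a = sum((a - mean_a) ** 2 for a in locus_a) / n
    var_b = sum((b - mean_b) ** 2 for b in locus_b) / n
    if var_a == 0 or var_b == 0:
        raise ValueError("r2 is undefined for a monomorphic locus")
    return cov * cov / (var_a * var_b)


if __name__ == "__main__":
    # Toy genotypes for four individuals, invented for illustration only.
    snp1 = [0, 1, 2, 1]
    snp2 = [2, 1, 0, 1]
    print(expected_heterozygosity(snp1))
    print(ld_r2(snp1, snp2))
```

In practice, a population-level H_E is usually averaged over loci, and LD decay is summarised by averaging r² over marker pairs grouped by physical distance, as in the 1 kb window quoted above.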

Acknowledgements

This research was funded by Hort Innovation, using the Macadamia research and development levy and contributions from the Australian Government. Hort Innovation is the grower-owned, not-for-profit research and development corporation for Australian horticulture. KO thanks the Australian Postgraduate Award and the Charles Morphett Peglar scholarship for financial support, and also thanks macadamia orchard growers and managers and the team at Diversity Arrays Technology for their guidance. Thanks also to Dr. Mark Dieters, Dr. Jodi Neal, Dr. David Innes, Professor Robert Henry and Kirsty Langdon for their suggestions and comments.

Data archiving statement

The SNP and silicoDArT markers generated and analysed during the current study are obtainable from the University of Queensland’s Institutional Data Access/Ethics Committee, but restrictions apply to the availability of these data. The dataset “SNPs and silicoDArT markers of B1.2 progeny and parents” is available at https://doi.org/10.14264/uql.2018.491 for researchers who meet the criteria for access to confidential data. Contact data@library.uq.edu.au

Author information

Corresponding author

Correspondence to Katie O’Connor.

Ethics declarations

Competing interests

Andrzej Kilian is employed by Diversity Arrays Technology Pty Ltd, which provided genotyping services for this study, but this had no effect on the study's conclusions.

Additional information

Communicated by M. Wirthensohn

Cite this article

O’Connor, K., Kilian, A., Hayes, B. et al. Population structure, genetic diversity and linkage disequilibrium in a macadamia breeding population using SNP and silicoDArT markers. Tree Genetics & Genomes 15, 24 (2019). https://doi.org/10.1007/s11295-019-1331-z

Keywords

  • Horticulture
  • Plant breeding
  • Progeny
  • Genomics
  • Diversity Arrays Technology