An innovative procedure of genome-wide association analysis fits studies on germplasm population and plant breeding
The innovative RTM-GWAS procedure provides a relatively thorough detection of QTL and their multiple alleles for germplasm population characterization, gene network identification, and genomic selection strategy innovation in plant breeding.
Previous genome-wide association studies (GWAS) have concentrated on finding a handful of major quantitative trait loci (QTL), but plant breeders are interested in revealing the whole-genome QTL-allele constitution of breeding materials and germplasm (in which tremendous historical allelic variation has accumulated) for genome-wide improvement. To meet this requirement, two innovations are proposed for GWAS: first, grouping tightly linked sequential SNPs into linkage disequilibrium blocks (SNPLDBs) to form markers with multi-allelic haplotypes; and second, using a two-stage association analysis for QTL identification, in which markers are preselected with a single-locus model and then subjected to stepwise regression under a multi-locus multi-allele model. The proposed procedure is termed restricted two-stage multi-locus multi-allele GWAS (RTM-GWAS, https://github.com/njau-sri/rtm-gwas). The Chinese soybean germplasm population (CSGP), composed of 1024 accessions with 36,952 SNPLDBs (generated from 145,558 SNPs, with reduced linkage disequilibrium decay distance), was used to demonstrate the power and efficiency of RTM-GWAS. Using the CSGP marker information, simulation studies showed that RTM-GWAS achieved the highest QTL detection power and efficiency among the procedures compared, especially under large sample size and high trait heritability. For 100-seed weight in CSGP, RTM-GWAS detected QTL and their multiple alleles more thoroughly than the linear mixed model method. A QTL-allele matrix (402 alleles of 139 QTL × 1024 accessions) was established as a compact representation of the population's genetic constitution. The 100-seed weight QTL-allele matrix was then used for genetic characterization, candidate gene prediction, and genomic selection of optimal crosses in the germplasm population.
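The first innovation, turning runs of tightly linked SNPs into multi-allelic haplotype markers, can be illustrated with a minimal Python sketch. It is not the RTM-GWAS implementation: the greedy rule used here (extend a block while adjacent SNPs have r² at or above a threshold), the threshold value, the 0/1 coding of homozygous inbred lines, and all function names are assumptions made purely for illustration.

```python
def r_squared(a, b):
    """Squared correlation (r^2) between two biallelic SNPs coded 0/1
    across homozygous accessions (assumed coding, for illustration)."""
    n = len(a)
    pa = sum(a) / n
    pb = sum(b) / n
    pab = sum(x * y for x, y in zip(a, b)) / n
    denom = pa * (1 - pa) * pb * (1 - pb)
    if denom == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return (pab - pa * pb) ** 2 / denom


def group_into_blocks(snps, threshold=0.5):
    """Greedy block partition (a simplification, not the published rule):
    a SNP joins the current block if its r^2 with the previous SNP
    is at least `threshold`; otherwise it starts a new block."""
    blocks = [[0]]
    for j in range(1, len(snps)):
        if r_squared(snps[j - 1], snps[j]) >= threshold:
            blocks[-1].append(j)
        else:
            blocks.append([j])
    return blocks


def haplotype_markers(snps, blocks):
    """For each block, return every accession's haplotype as a string;
    each distinct string is one allele of the block marker."""
    n = len(snps[0])
    return [
        ["".join(str(snps[j][i]) for j in block) for i in range(n)]
        for block in blocks
    ]


# Toy data: 3 SNPs (rows) scored on 8 accessions (columns).
snps = [
    [0, 0, 1, 1, 0, 1, 1, 1],
    [0, 0, 1, 1, 0, 1, 1, 0],
    [0, 1, 0, 1, 0, 1, 0, 1],
]

blocks = group_into_blocks(snps, threshold=0.5)
markers = haplotype_markers(snps, blocks)

print(blocks)                   # [[0, 1], [2]]
print(sorted(set(markers[0])))  # ['00', '10', '11']
```

In this toy example the first two SNPs merge into one block with three haplotype alleles, whereas each SNP alone could distinguish only two. This is the sense in which block markers can capture multiple alleles per locus.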
This work was supported by the China National Key R&D Program for Crop Breeding (2016YFD0100304), the China National Key Basic Research Program (2011CB1093), the China National High-tech R&D Program (2012AA101106), the Natural Science Foundation of China (31571695), the MOE 111 Project (B08025), the Program for Changjiang Scholars and Innovative Research Team in University (PCSIRT13073), the MOA Public Profit Program (201203026-4), the MOA CARS-04 program, the Jiangsu Higher Education PAPD Program, and the Jiangsu JCIC-MCP Program. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Compliance with ethical standards
Conflict of interest
The authors declare no conflict of interest.