Abstract
Key message
Using genomic structural equation modelling, this research demonstrates an efficient way to identify genetically correlating traits and provides an effective proxy for multi-trait selection to consider the joint genetic architecture of multiple interacting traits in crop breeding.
Abstract
Breeding crop cultivars with optimal value across multiple traits has been a challenge, as traits may negatively correlate due to pleiotropy or genetic linkage. For example, grain yield and grain protein content correlate negatively with each other in cereal crops. Future crop breeding needs to be based on practical yet accurate evaluation and effective selection of beneficial traits to retain genes with the best agronomic score for multiple traits. Here, we test the framework of a whole-system-based approach using structural equation modelling (SEM) to investigate how one trait affects others and to guide the optimal selection of a combination of agronomically important traits. Using ten traits and genome-wide SNP profiles from a worldwide barley panel and SEM analysis, we revealed a network of interacting traits, in which tiller number contributes positively to both grain yield and protein content; we further identified common genetic factors affecting multiple traits in the network of interaction. Our method demonstrates an efficient way to identify genetically correlating traits and underlying pleiotropic genetic factors and provides an effective proxy for multi-trait selection within a whole-system framework that considers the joint genetic architecture of multiple interacting traits in crop breeding. Our findings suggest the promise of a whole-system approach to overcome challenges such as the negative correlation of grain yield and protein content and to facilitate quantitative and objective breeding decisions in future crop breeding.




Data availability
Data have been deposited in public repositories, with the link provided.
Acknowledgements
We thank Ms Lee-Anne McFawn, Ms Jenifer Bussanich and Mr David Farleigh from DPIRD (South Perth, WA) for providing technical assistance in the field trials. We thank Dr Andrew Grotzinger (University of Texas at Austin) and Dr Michel Nivard (Vrije Universiteit Amsterdam) for tailoring the scripts of the R package “GenomicSEM” to the barley genome, and for their advice in running the package. The authors declare no conflict of interest.
Funding
This work was supported by funding from the Grains Research and Development Corporation (GRDC) of Australia (DAW00240/UMU00050, UMU00049 and DAW00233), Department of Primary Industries and Regional Development (DPIRD), and Western Australian State Agricultural Biotechnology Centre (SABC).
Author information
Contributions
CL and TA conceived the project, collected the barley accessions, conducted initial phenological evaluation, and selected the accessions for this study. TH developed the models, analysed the data, and wrote the paper with input from CL. TA, XZ, HL, YW, SK, GZ, SW and CL conducted the field experiments and phenotyping. KC conducted the analysis of the agronomic data. CH, XZ, PW, GZ, and CT prepared the low-coverage WGS and DArTseq data. CL supervised the project.
Ethics declarations
Conflicts of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Additional information
Communicated by Martin Boer.
About this article
Cite this article
He, T., Angessa, T.T., Hill, C.B. et al. Genomic structural equation modelling provides a whole-system approach for the future crop breeding. Theor Appl Genet 134, 2875–2889 (2021). https://doi.org/10.1007/s00122-021-03865-4
DOI: https://doi.org/10.1007/s00122-021-03865-4

