Abstract
Key message
Two read-depth methods were jointly applied to next-generation sequencing data to identify deletions in a maize population. Deletion-based GWAS were then performed for gene expression patterns and for classical traits, respectively.
Abstract
Many studies have confirmed that structural variation (SV) is pervasive throughout the maize genome. Deletion is one type of SV that may affect gene expression and cause phenotypic changes in quantitative traits. In this study, two read-count approaches were used to identify deletions in the whole-genome sequencing data of 270 maize inbred lines. A total of 19,754 deletion windows overlapped 12,751 genes and were unevenly distributed across the genome. The deletions explained population structure well and correlated with genomic features. The deletion proportion of a gene was negatively correlated with its expression. Mapping of expression quantitative trait loci (eQTL) indicated that local eQTL were fewer but had larger effects than distant ones. The commonly associated genes were related to basic metabolic processes, whereas the uniquely associated genes with eQTL played a role in stress or stimulus responses in multiple tissues. In a comparison with eQTL detected by SNPs derived from the same sequencing data, 89.4% of the associated genes could be detected by both marker types. For the same gene, the effect of the top eQTL detected by SNPs was usually larger than that of the top eQTL detected by deletions. A genome-wide association study (GWAS) of flowering time and plant height showed that only a few of the loci identified by deletions were also captured by SNPs, suggesting that combining deletions and SNPs in GWAS is an excellent strategy for dissecting trait architecture. Our findings provide insights into the characteristics and biological functions of genome-wide deletions in maize.
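As a rough illustration of the read-depth idea summarized above, the following Python sketch flags windows whose read count is far below a sample's typical depth and then computes the fraction of a gene covered by flagged windows. It is a minimal sketch under stated assumptions, not the pipeline used in the study: the fixed window size, the median normalization, the cutoff `ratio_threshold=0.1`, the half-open coordinates, and the function names are all hypothetical choices made for illustration.

```python
from statistics import median


def call_deletion_windows(read_counts, ratio_threshold=0.1):
    """Flag windows whose normalized read depth is below a cutoff.

    read_counts: per-window read counts for ONE sample (hypothetical input).
    ratio_threshold: illustrative cutoff, not a value taken from the study.
    """
    covered = [c for c in read_counts if c > 0]
    if not covered:
        raise ValueError("no covered windows; cannot normalize")
    # Use the median of covered windows as the sample's baseline depth.
    baseline = median(covered)
    return [c / baseline < ratio_threshold for c in read_counts]


def deletion_proportion(gene_start, gene_end, window_size, deleted):
    """Fraction of a gene's length overlapped by deleted windows.

    Assumes fixed-size windows starting at position 0 and half-open
    coordinates [start, end); both are simplifying assumptions.
    """
    overlap = 0
    for i, is_deleted in enumerate(deleted):
        if not is_deleted:
            continue
        w_start = i * window_size
        w_end = w_start + window_size
        overlap += max(0, min(gene_end, w_end) - max(gene_start, w_start))
    return overlap / (gene_end - gene_start)


counts = [100, 98, 3, 0, 105, 97, 102]
flags = call_deletion_windows(counts)
proportion = deletion_proportion(2500, 4500, 1000, flags)
```

In this toy input the third and fourth windows are flagged, and a gene spanning positions 2500 to 4500 has three quarters of its length inside flagged windows. A per-gene value of this kind is what one would correlate with expression, although the study's own definition and thresholds may differ.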
Data availability
The sequencing data of the 270 inbred lines are available from NCBI SRA under accession PRJNA389800 and can also be downloaded from the CyVerse Data Store: /iplant/home/shared/panzea/raw_seq_282/bam/ (Bukowski et al. 2017). The VCF files of SNP genotyping data with AGPv3 coordinates can be downloaded from the directory /iplant/home/shared/commons_repo/curated/Qi_Sun_Zea_mays_haplotype_map_2018/282_onHmp321. RNA-seq data have been deposited in the Sequence Read Archive under accession number SRP115041 and in BioProject under accession number PRJNA383416 (Kremling et al. 2018).
References
Albert FW, Kruglyak L (2015) The role of regulatory variation in complex traits and disease. Nat Rev Genet 16:197
Alonge M, Wang X, Benoit M, Soyk S, Pereira L, Zhang L, Suresh H, Ramakrishnan S, Maumus F, Ciren D, Levy Y, Harel TH, Shalev-Schlosser G, Amsellem Z, Razifard H, Caicedo AL, Tieman DM, Klee H, Kirsche M, Aganezov S, Ranallo-Benavidez TR, Lemmon ZH, Kim J, Robitaille G, Kramer M, Goodwin S, McCombie WR, Hutton S, Van Eck J, Gillis J, Eshed Y, Sedlazeck FJ, van der Knaap E, Schatz MC, Lippman ZB (2020) Major impacts of widespread structural variation on gene expression and crop improvement in tomato. Cell 182:145-161.e123
Bradbury PJ, Zhang Z, Kroon DE, Casstevens TM, Ramdoss Y, Buckler ES (2007) TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics 23:2633–2635
Buckler ES, Holland JB, Bradbury PJ, Acharya CB, Brown PJ, Browne C, Ersoz E, Flint-Garcia S, Garcia A, Glaubitz JC, Goodman MM, Harjes C, Guill K, Kroon DE, Larsson S, Lepak NK, Li H, Mitchell SE, Pressoir G, Peiffer JA, Rosas MO, Rocheford TR, Romay MC, Romero S, Salvo S, Villeda HS, Sofia da Silva H, Sun Q, Tian F, Upadyayula N, Ware D, Yates H, Yu J, Zhang Z, Kresovich S, McMullen MD (2009) The genetic architecture of maize flowering time. Science 325:714
Bukowski R, Guo X, Lu Y, Zou C, He B, Rong Z, Wang B, Xu D, Yang B, Xie C, Fan L, Gao S, Xu X, Zhang G, Li Y, Jiao Y, Doebley JF, Ross-Ibarra J, Lorant A, Buffalo V, Romay MC, Buckler ES, Ware D, Lai J, Sun Q, Xu Y (2017) Construction of the third-generation Zea mays haplotype map. GigaScience 7
Castelletti S, Tuberosa R, Pindo M, Salvi S (2014) A MITE Transposon Insertion Is Associated with Differential Methylation at the Maize Flowering Time QTL Vgt1. G3 Genes Genom Genet, 4:805–812
Chiang C, Scott AJ, Davis JR, Tsang EK, Li X, Kim Y, Hadzic T, Damani FN, Ganel L, Montgomery SB, Battle A, Conrad DF, Hall IM, Consortium GT (2017) The impact of structural variation on human gene expression. Nat Genet 49:692–699
The 1000 Genomes Project Consortium (2012) An integrated map of genetic variation from 1,092 human genomes. Nature 491:56
Cook DE, Lee TG, Guo X, Melito S, Wang K, Bayless AM, Wang J, Hughes TJ, Willis DK, Clemente TE, Diers BW, Jiang J, Hudson ME, Bent AF (2012) Copy number variation of multiple genes at Rhg1 mediates nematode resistance in soybean. Science 338:1206
Cremer T, Cremer M, Dietzel S, Müller S, Solovei I, Fakan S (2006) Chromosome territories—A functional nuclear landscape. Curr Opin Cell Biol 18:307–316
Della Coletta R, Qiu Y, Ou S, Hufford MB, Hirsch CN (2021) How the pan-genome is changing crop genomics and improvement. Genome Biol 22:3
Díaz A, Zikhali M, Turner AS, Isaac P, Laurie DA (2012) Copy number variation affecting the photoperiod-B1 and vernalization-A1 genes is associated with altered flowering time in wheat (Triticum aestivum). PLoS ONE 7:e33234
Dolatabadian A, Patel DA, Edwards D, Batley J (2017) Copy number variation and disease resistance in plants. Theor Appl Genet 130:2479–2490
Fan K-H, Devos KM, Schliekelman P (2020) Strategies for eQTL mapping in allopolyploid organisms. Theor Appl Genet 133:2477–2497
Feuk L, Carson AR, Scherer SW (2006) Structural variation in the human genome. Nat Rev Genet 7:85–97
Flint-Garcia SA, Thuillet A-C, Yu J, Pressoir G, Romero SM, Mitchell SE, Doebley J, Kresovich S, Goodman MM, Buckler ES (2005) Maize association population: a high-resolution platform for quantitative trait locus dissection. Plant J 44:1054–1064
Fraser P, Bickmore W (2007) Nuclear organization of the genome and the potential for gene regulation. Nature 447:413
Fu J, Cheng Y, Linghu J, Yang X, Kang L, Zhang Z, Zhang J, He C, Du X, Peng Z, Wang B, Zhai L, Dai C, Xu J, Wang W, Li X, Zheng J, Chen L, Luo L, Liu J, Qian X, Yan J, Wang J, Wang G (2013) RNA sequencing reveals the complex regulatory network in the maize kernel. Nat Commun 4:2832
Gabur I, Chawla HS, Snowdon RJ, Parkin IAP (2019) Connecting genome structural variation with complex traits in crop plants. Theor Appl Genet 132:733–750
Gamazon ER, Stranger BE (2015) The impact of human copy number variation on gene expression. Brief Funct Genomics 14:352–357
Golicz AA, Batley J, Edwards D (2016) Towards plant pangenomics. Plant Biotechnol J 14:1099–1105
Gonzalez E, Kulkarni H, Bolivar H, Mangano A, Sanchez R, Catano G, Nibbs RJ, Freedman BI, Quinones MP, Bamshad MJ (2005) The influence of CCL3L1 gene-containing segmental duplications on HIV-1/AIDS susceptibility. Science 307:1434
Gore MA, Chia J-M, Elshire RJ, Sun Q, Ersoz ES, Hurwitz BL, Peiffer JA, McMullen MD, Grills GS, Ross-Ibarra J, Ware DH, Buckler ES (2009) A first-generation haplotype map of maize. Science 326:1115–1117
Ha G, Roth A, Lai D, Bashashati A, Ding J, Goya R, Giuliany R, Rosner J, Oloumi A, Shumansky K (2012) Integrative analysis of genome-wide loss of heterozygosity and monoallelic expression at nucleotide resolution reveals disrupted pathways in triple-negative breast cancer. Genome Res 22:1995–2007
Handsaker RE, Van Doren V, Berman JR, Genovese G, Kashin S, Boettger LM, McCarroll SA (2015) Large multiallelic copy number variations in humans. Nat Genet 47:296
Hansen BG, Halkier BA, Kliebenstein DJ (2008) Identifying the molecular basis of QTLs: eQTLs add a new dimension. Trends Plant Sci 13:72–77
Helbig I, Mefford HC, Sharp AJ, Guipponi M, Fichera M, Franke A, Muhle H, de Kovel C, Baker C, von Spiczak S, Kron KL, Steinich I, Kleefusz-Lie AA, Leu C, Gaus V, Schmitz B, Klein KM, Reif PS, Rosenow F, Weber Y, Lerche H, Zimprich F, Urak L, Fuchs K, Feucht M, Genton P, Thomas P, Visscher F, de Haan G-J, Moller RS, Hjalgrim H, Luciano D, Wittig M, Nothnagel M, Elger CE, Nurnberg P, Romano C, Malafosse A, Koeleman BPC, Lindhout D, Stephani U, Schreiber S, Eichler EE, Sander T (2009) 15q13.3 microdeletions increase risk of idiopathic generalized epilepsy. Nat Genet 41:160–162
Holloway B, Luck S, Beatty M, Rafalski JA, Li B (2011) Genome-wide expression quantitative trait loci (eQTL) analysis in maize. BMC Genom 12:336
Huang C, Sun H, Xu D, Chen Q, Liang Y, Wang X, Xu G, Tian J, Wang C, Li D, Wu L, Yang X, Jin W, Doebley JF, Tian F (2018) ZmCCT9 enhances maize adaptation to higher latitudes. Proc Natl Acad Sci 115:E334
Hufford MB, Seetharam AS, Woodhouse MR, Chougule KM, Ou S, Liu J, Ricci WA, Guo T, Olson A, Qiu Y, Della Coletta R, Tittes S, Hudson AI, Marand AP, Wei S, Lu Z, Wang B, Tello-Ruiz MK, Piri RD, Wang N, Dw K, Zeng Y, O’Connor CH, Li X, Gilbert AM, Baggs E, Krasileva KV, Portwood JL, Cannon EKS, Andorf CM, Manchanda N, Snodgrass SJ, Hufnagel DE, Jiang Q, Pedersen S, Syring ML, Kudrna DA, Llaca V, Fengler K, Schmitz RJ, Ross-Ibarra J, Yu J, Gent JI, Hirsch CN, Ware D, Dawe RK (2021) De novo assembly, annotation, and comparative analysis of 26 diverse maize genomes. Science 373:655
Broad Institute (2019) Picard Toolkit. GitHub repository, http://broadinstitute.github.io/picard/
Jin J, Tian F, Yang D-C, Meng Y-Q, Kong L, Luo J, Gao G (2016) PlantTFDB 4.0: toward a central hub for transcription factors and regulatory interactions in plants. Nucleic Acids Res 45:D1040–D1045
Kim S, Misra A (2007) SNP genotyping: technologies and biomedical applications. Annu Rev Biomed Eng 9:289–320
Knouse KA, Wu J, Amon A (2016) Assessment of megabase-scale somatic copy number variation using single-cell sequencing. Genome Res 26:376
Kremling KAG, Chen S-Y, Su M-H, Lepak NK, Romay MC, Swarts KL, Lu F, Lorant A, Bradbury PJ, Buckler ES (2018) Dysregulation of expression correlates with rare-allele burden and fitness loss in maize. Nature 555:520
Kremling KAG, Diepenbrock CH, Gore MA, Buckler ES, Bandillo NB (2019) Transcriptome-Wide Association Supplements Genome-Wide Association in Zea mays. G3 Genes Genom Genet, 9:3023–3033
Lai J, Li R, Xu X, Jin W, Xu M, Zhao H, Xiang Z, Song W, Ying K, Zhang M, Jiao Y, Ni P, Zhang J, Li D, Guo X, Ye K, Jian M, Wang B, Zheng H, Liang H, Zhang X, Wang S, Chen S, Li J, Fu Y, Springer NM, Yang H, Wang J, Dai J, Schnable PS, Wang J (2010) Genome-wide patterns of genetic variation among elite maize inbred lines. Nat Genet 42:1027–1030
Li H, Durbin R (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R (2009) The Sequence alignment/map format and SAMtools. Bioinformatics 25:2078–2079
Li Y-x, Li C, Bradbury PJ, Liu X, Lu F, Romay CM, Glaubitz JC, Wu X, Peng B, Shi Y, Song Y, Zhang D, Buckler ES, Zhang Z, Li Y, Wang T (2016) Identification of genetic variants associated with maize flowering time using an extremely large multi-genetic background population. Plant J 86:391–402
Lin G, He C, Zheng J, Koo D-H, Le H, Zheng H, Tamang TM, Lin J, Liu Y, Zhao M, Hao Y, McFraland F, Wang B, Qin Y, Tang H, McCarty DR, Wei H, Cho M-J, Park S, Kaeppler H, Kaeppler SM, Liu Y, Springer N, Schnable PS, Wang G, White FF, Liu S (2021) Chromosome-level genome assembly of a regenerable maize inbred line A188. Genome Biol 22:175
Liu H, Huang Y, Li X, Wang H, Ding Y, Kang C, Sun M, Li F, Wang J, Deng Y, Yang X, Huang X, Gao X, Yuan L, An D, Wang W, Holding DR, Wu Y (2019) High frequency DNA rearrangement at qγ27 creates a novel allele for Quality Protein Maize breeding. Commun Biol 2:460
Liu H, Luo X, Niu L, Xiao Y, Chen L, Liu J, Wang X, Jin M, Li W, Zhang Q, Yan J (2017a) Distant eQTLs and non-coding sequences play critical roles in regulating gene expression and quantitative trait variation in maize. Mol Plant 10:414–426
Liu L, Du Y, Shen X, Li M, Sun W, Huang J, Liu Z, Tao Y, Zheng Y, Yan J, Zhang Z (2015) KRN4 controls quantitative variation in maize kernel row number. PLOS Genet 11:e1005670
Liu Q, Liu H, Gong Y, Tao Y, Jiang L, Zuo W, Yang Q, Ye J, Lai J, Wu J, Lübberstedt T, Xu M (2017b) An atypical thioredoxin imparts early resistance to sugarcane mosaic virus in maize. Mol Plant 10:483–497
Liu S, Li C, Wang H, Wang S, Yang S, Liu X, Yan J, Li B, Beatty M, Zastrow-Hayes G, Song S, Qin F (2020a) Mapping regulatory variants controlling gene expression in drought response and tolerance in maize. Genome Biol 21:163
Liu Y, Du H, Li P, Shen Y, Peng H, Liu S, Zhou G-A, Zhang H, Liu Z, Shi M, Huang X, Li Y, Zhang M, Wang Z, Zhu B, Han B, Liang C, Tian Z (2020b) Pan-genome of wild and cultivated soybeans. Cell 182:162-176.e113
Lu F, Romay MC, Glaubitz JC, Bradbury PJ, Elshire RJ, Wang T, Li Y, Li Y, Semagn K, Zhang X, Hernandez AG, Mikel MA, Soifer I, Barad O, Buckler ES (2015) High-resolution genetic mapping of maize pan-genome sequence anchors. Nat Commun 6:6914
Lu Y, Shah T, Hao Z, Taba S, Zhang S, Gao S, Liu J, Cao M, Wang J, Prakash AB, Rong T, Xu Y (2011) Comparative SNP and haplotype analysis reveals a higher genetic diversity and rapider LD decay in tropical than temperate germplasm in maize. PLoS ONE 6:e24861
Lye ZN, Purugganan MD (2019) Copy number variation in domestication. Trends Plant Sci 24:352–365
Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, McCarthy MI, Ramos EM, Cardon LR, Chakravarti A, Cho JH, Guttmacher AE, Kong A, Kruglyak L, Mardis E, Rotimi CN, Slatkin M, Valle D, Whittemore AS, Boehnke M, Clark AG, Eichler EE, Gibson G, Haines JL, Mackay TFC, McCarroll SA, Visscher PM (2009) Finding the missing heritability of complex diseases. Nature 461:747–753
Maron LG, Guimarães CT, Kirst M, Albert PS, Birchler JA, Bradbury PJ, Buckler ES, Coluccio AE, Danilova TV, Kudrna D (2013) Aluminum tolerance in maize is associated with higher MATE1 gene copy number. Proc Natl Acad Sci USA 110:5241
McConnell MJ, Lindberg MR, Brennand KJ, Piper JC, Voet T, Cowing-Zitron C, Shumilina S, Lasken RS, Vermeesch JR, Hall IM, Gage FH (2013) Mosaic copy number variation in human neurons. Science 342:632
Meng X, Muszynski MG, Danilevskaya ON (2011) The FT-Like ZCN8 gene functions as a floral activator and is involved in photoperiod sensitivity in maize. Plant Cell 23:942
Michael TP, VanBuren R (2020) Building near-complete plant genomes. Curr Opin Plant Biol 54:26–33
Miculan M, Nelissen H, Ben Hassen M, Marroni F, Inzé D, Pè ME, Dell’Acqua M (2021) A forward genetics approach integrating genome-wide association study and expression quantitative trait locus mapping to dissect leaf development in maize (Zea mays). Plant J 107:1056–1071
Mills RE, Walter K, Stewart C, Handsaker RE, Chen K, Alkan C, Abyzov A, Yoon SC, Ye K, Cheetham RK (2011) Mapping copy number variation by population-scale genome sequencing. Nature 470:59–65
Nuytemans K, Meeus B, Crosiers D, Brouwers N, Goossens D, Engelborghs S, Pals P, Pickut B, Van DBM, Corsmit E (2009) Relative contribution of simple mutations vs. copy number variations in five Parkinson disease genes in the Belgian population. Human Mutation 30:1054–1061
Pang J, Fu J, Zong N, Wang J, Song D, Zhang X, He C, Fang T, Zhang H, Fan Y, Wang G, Zhao J (2019) Kernel size-related genes revealed by an integrated eQTL analysis during early maize kernel development. Plant J 98:19–32
Paschold A, Jia Y, Marcon C, Lund S, Larson NB, Yeh C-T, Ossowski S, Lanz C, Nettleton D, Schnable PS (2012) Complementation contributes to transcriptome complexity in maize (Zea mays L.) hybrids relative to their inbred parents. Genome Res 22:2445–2454
Peiffer JA, Romay MC, Gore MA, Flint-Garcia SA, Zhang Z, Millard MJ, Gardner CAC, McMullen MD, Holland JB, Bradbury PJ (2014) The genetic architecture of maize height. Genetics 196:1337
Prasad A, Merico D, Thiruvahindrapuram B, Wei J, Lionel AC, Sato D, Rickaby J, Lu C, Szatmari P, Roberts W, Fernandez BA, Marshall CR, Hatchwell E, Eis PS, Scherer SW (2012) A Discovery Resource of Rare Copy Number Variations in Individuals with Autism Spectrum Disorder. G3: Genes Genom Genet, 2:1665–1685
Qian L, Voss-Fels K, Cui Y, Jan HU, Samans B, Obermeier C, Qian W, Snowdon RJ (2016) Deletion of a stay-green gene associates with adaptive selection in Brassica napus. Mol Plant 9:1559–1569
Qin P, Lu H, Du H, Wang H, Chen W, Chen Z, He Q, Ou S, Zhang H, Li X, Li X, Li Y, Liao Y, Gao Q, Tu B, Yuan H, Ma B, Wang Y, Qian Y, Fan S, Li W, Wang J, He M, Yin J, Li T, Jiang N, Chen X, Liang C, Li S (2021) Pan-genome analysis of 33 genetically diverse rice accessions reveals hidden genomic variations. Cell 184:3542-3558.e3516
Quinlan AR, Hall IM (2010) BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26:841–842
Reymond A, Henrichsen CN, Harewood L, Merla G (2007) Side effects of genome structural changes. Curr Opin Genet Dev 17:381–386
Rodgers-Melnick E, Bradbury PJ, Elshire RJ, Glaubitz JC, Acharya CB, Mitchell SE, Li C, Li Y, Buckler ES (2015) Recombination in diverse maize is stable, predictable, and associated with genetic load. Proc Natl Acad Sci USA 112:3823–3828
Romero Navarro JA, Willcox M, Burgueño J, Romay C, Swarts K, Trachsel S, Preciado E, Terron A, Delgado HV, Vidal V, Ortega A, Banda AE, Montiel NOG, Ortiz-Monasterio I, Vicente FS, Espinoza AG, Atlin G, Wenzl P, Hearne S, Buckler ES (2017) A study of allelic diversity underlying flowering-time adaptation in maize landraces. Nat Genet 49:476–480
Salvi S, Sponza G, Morgante M, Tomes D, Niu X, Fengler KA, Meeley R, Ananiev EV, Svitashev S, Bruggemann E, Li B, Hainey CF, Radovic S, Zaina G, Rafalski JA, Tingey SV, Miao G-H, Phillips RL, Tuberosa R (2007) Conserved noncoding genomic sequences associated with a flowering-time quantitative trait locus in maize. Proc Natl Acad Sci 104:11376
Schnable PS, Ware D, Fulton RS, Stein JC, Wei F, Pasternak S, Liang C, Zhang J, Fulton L, Graves TA, Minx P, Reily AD, Courtney L, Kruchowski SS, Tomlinson C, Strong C, Delehaunty K, Fronick C, Courtney B, Rock SM, Belter E, Du F, Kim K, Abbott RM, Cotton M, Levy A, Marchetto P, Ochoa K, Jackson SM, Gillam B, Chen W, Yan L, Higginbotham J, Cardenas M, Waligorski J, Applebaum E, Phelps L, Falcone J, Kanchi K, Thane T, Scimone A, Thane N, Henke J, Wang T, Ruppert J, Shah N, Rotter K, Hodges J, Ingenthron E, Cordes M, Kohlberg S, Sgro J, Delgado B, Mead K, Chinwalla A, Leonard S, Crouse K, Collura K, Kudrna D, Currie J, He R, Angelova A, Rajasekar S, Mueller T, Lomeli R, Scara G, Ko A, Delaney K, Wissotski M, Lopez G, Campos D, Braidotti M, Ashley E, Golser W, Kim H, Lee S, Lin J, Dujmic Z, Kim W, Talag J, Zuccolo A, Fan C, Sebastian A, Kramer M, Spiegel L, Nascimento L, Zutavern T, Miller B, Ambroise C, Muller S, Spooner W, Narechania A, Ren L, Wei S, Kumari S, Faga B, Levy MJ, McMahan L, Van Buren P, Vaughn MW, Ying K, Yeh C-T, Emrich SJ, Jia Y, Kalyanaraman A, Hsia A-P, Barbazuk WB, Baucom RS, Brutnell TP, Carpita NC, Chaparro C, Chia J-M, Deragon J-M, Estill JC, Fu Y, Jeddeloh JA, Han Y, Lee H, Li P, Lisch DR, Liu S, Liu Z, Nagel DH, McCann MC, SanMiguel P, Myers AM, Nettleton D, Nguyen J, Penning BW, Ponnala L, Schneider KL, Schwartz DC, Sharma A, Soderlund C, Springer NM, Sun Q, Wang H, Waterman M, Westerman R, Wolfgruber TK, Yang L, Yu Y, Zhang L, Zhou S, Zhu Q, Bennetzen JL, Dawe RK, Jiang J, Jiang N, Presting GG, Wessler SR, Aluru S, Martienssen RA, Clifton SW, McCombie WR, Wing RA, Wilson RK (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326:1112–1115
Schneider KL, Xie Z, Wolfgruber TK, Presting GG (2016) Inbreeding drives maize centromere evolution. Proc Natl Acad Sci 113:E987
Sedlazeck FJ, Rescheneder P, Smolka M, Fang H, Nattestad M, von Haeseler A, Schatz MC (2018) Accurate detection of complex structural variations using single-molecule sequencing. Nat Methods 15:461–468
Springer NM, Ying K, Fu Y, Ji T, Yeh CT, Jia Y, Wu W, Richmond T, Kitzman J, Rosenbaum H (2009) Maize inbreds exhibit high levels of copy number variation (CNV) and presence/absence variation (PAV) in genome content. PLoS Genet 5:e1000734
Stegle O, Parts L, Durbin R, Winn J (2010) A Bayesian framework to account for complex non-genetic factors in gene expression levels greatly increases power in eQTL studies. PLOS Comput Biol 6:e1000770
Studer A, Zhao Q, Ross-Ibarra J, Doebley J (2011) Identification of a functional transposon insertion in the maize domestication gene tb1. Nat Genet 43:1160
Sudmant PH, Rausch T, Gardner EJ, Handsaker RE, Abyzov A, Huddleston J, Zhang Y, Ye K, Jun G, Fritz HY (2015) An integrated map of structural variation in 2,504 human genomes. Nature 526:75
Sun S, Zhou Y, Chen J, Shi J, Zhao H, Zhao H, Song W, Zhang M, Cui Y, Dong X, Liu H, Ma X, Jiao Y, Wang B, Wei X, Stein JC, Glaubitz JC, Lu F, Yu G, Liang C, Fengler K, Li B, Rafalski A, Schnable PS, Ware DH, Buckler ES, Lai J (2018) Extensive intraspecific gene order and gene structural variations between Mo17 and other maize genomes. Nat Genet 50:1289–1295
Swanson-Wagner RA, Eichten SR, Kumari S, Tiffin P, Stein JC, Ware D, Springer NM (2010) Pervasive gene content variation and copy number variation in maize and its undomesticated progenitor. Genome Res 20:1689
Tian T, Liu Y, Yan H, You Q, Yi X, Du Z, Xu W, Su Z (2017) agriGO v2.0: a GO analysis toolkit for the agricultural community, 2017 update. Nucleic Acids Res, 45
Tranchant‐Dubreuil C, Rouard M, Sabot F (2018) Plant pangenome: impacts on phenotypes and evolution. Annual Plant Reviews online:1–25
Wang X, Chen Q, Wu Y, Lemmon ZH, Xu G, Huang C, Liang Y, Xu D, Li D, Doebley JF, Tian F (2018) Genome-wide analysis of transcriptional variability in a large maize-teosinte population. Mol Plant 11:443–459
Wolfgruber TK, Sharma A, Schneider KL, Albert PS, Koo D-H, Shi J, Gao Z, Han F, Lee H, Xu R, Allison J, Birchler JA, Jiang J, Dawe RK, Presting GG (2009) Maize centromere structure and evolution: sequence analysis of centromeres 2 and 5 reveals dynamic loci shaped primarily by retrotransposons. PLOS Genet 5:e1000743
Xie C, Mao X, Huang J, Ding Y, Wu J, Dong S, Kong L, Gao G, Li C-Y, Wei L (2011) KOBAS 2.0: a web server for annotation and identification of enriched pathways and diseases. Nucleic Acids Res 39:W316–W322
Yan J, Shah T, Warburton ML, Buckler ES, McMullen MD, Crouch J (2009) Genetic characterization and linkage disequilibrium estimation of a global maize collection using SNP markers. PLoS ONE 4:e8451
Yang L, Liu B, Huang B, Deng J, Li H, Yu B, Qiu F, Cheng M, Wang H, Yang R, Yang X, Zhou Y, Lu J (2013a) A functional copy number variation in the WWOX gene is associated with lung cancer risk in Chinese. Hum Mol Genet 22:1886–1894
Yang N, Liu J, Gao Q, Gui S, Chen L, Yang L, Huang J, Deng T, Luo J, He L, Wang Y, Xu P, Peng Y, Shi Z, Lan L, Ma Z, Yang X, Zhang Q, Bai M, Li S, Li W, Liu L, Jackson D, Yan J (2019) Genome assembly of a tropical maize inbred line provides insights into structural variation and crop improvement. Nat Genet 51:1052–1059
Yang Q, Li Z, Li W, Ku L, Wang C, Ye J, Li K, Yang N, Li Y, Zhong T, Li J, Chen Y, Yan J, Yang X, Xu M (2013b) CACTA-like transposable element in ZmCCT attenuated photoperiod sensitivity and accelerated the postdomestication spread of maize. Proc Natl Acad Sci 110:16969
Yuan Y, Bayer PE, Batley J, Edwards D (2021) Current status of structural variation studies in plants. Plant Biotechnol J. https://doi.org/10.1111/pbi.13646
Zhang X, Zhang H, Li L, Lan H, Ren Z, Liu D, Wu L, Liu H, Jaqueth J, Li B, Pan G, Gao S (2016) Characterizing the population structure and genetic diversity of maize breeding germplasm in Southwest China using genome-wide SNP markers. BMC Genom 17:697
Zmienko A, Marszalek-Zenczak M, Wojciechowski P, Samelak-Czajka A, Luczak M, Kozlowski P, Karlowski WM, Figlerowicz M (2020) AthCNV: a map of DNA copy number variations in the Arabidopsis genome. Plant Cell 32:1797–1819
Zuo W, Chao Q, Zhang N, Ye J, Tan G, Li B, Xing Y, Zhang B, Liu H, Fengler KA, Zhao J, Zhao X, Chen Y, Lai J, Yan J, Xu M (2014) A maize wall-associated kinase confers quantitative resistance to head smut. Nat Genet 47:151
Acknowledgements
XZ thanks Sara Miller, Xiaolei Liu, Dong Zhang, and Tao Zuo for their help while working in the Buckler lab and gratefully acknowledges the China Scholarship Council (CSC) for financial support while studying in the USA. The authors also thank Dan Liu, Ling Wu, and Bowen Luo for help with data analysis and Duojiang Gao and Shiqiang Gao for help with data collection at Sichuan Agricultural University, China.
Funding
This work was supported by Sichuan Science and Technology Support Project 2021YFYZ0027, 2021YFYZ0020, and 2021YFFZ0017, China Agriculture Research System of MOF and MARA, the National Natural Science Foundation of China (31971955), and the US Department of Agriculture–Agricultural Research Service and the National Science Foundation grant IOS-1238014 to E.S.B.
Author information
Contributions
XZ performed bioinformatics analysis and wrote the manuscript. YZ performed bioinformatics analysis and reviewed the manuscript. KK participated in large-scale sample collection, RNA-seq and expression data production, and revised the manuscript. MCR managed the field work, collected tissues for next-generation sequencing of the population, and reviewed the manuscript. RB and QS performed raw sequencing data processing and data management. SG, ESB, and FL contributed to manuscript discussion and writing. FL and XZ conceived the project. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Additional information
Communicated by Thomas Lubberstedt.
Supplementary Information
Below is the link to the electronic supplementary material.
122_2021_3965_MOESM1_ESM.pdf
Fig. S1 Correlation between read depth (q > 30) and average read count, average copy number per window. a. Correlation between read depth (q > 30) and average read count per window in HMMCopy method. b. Correlation between read depth (q > 30) and average read count per window in dynamic window method. c. Correlation between read depth (q > 30) and average copy number per window in HMMCopy method. d. Correlation between read depth (q > 30) and average copy number per window in dynamic window method (PDF 455 kb)
122_2021_3965_MOESM3_ESM.pdf
Fig. S3 Deletion allele frequency and inbred line frequency for deletion windows in this population. a. Deletion allele frequency distribution of inbred lines. b. Inbred line frequency for deletion windows (PDF 39 kb)
122_2021_3965_MOESM4_ESM.pdf
Fig. S4 Comparison of polymorphic SNP and deletion allele frequencies derived from B73. a. Proportion of polymorphic SNPs and deletions according to different allele frequencies derived from B73. b. Correlation between the allele frequency of polymorphic deletions and the mean SNP frequency derived from B73 within the deletion window (PDF 1890 kb)
122_2021_3965_MOESM5_ESM.pdf
Fig. S5 A simulation of PCA analysis by different deletion window numbers. PCA analysis using 5000 (a), 10,000 (b), 20,000 (c), 30,000 (d), 40,000 (e), 100,000 (f) randomly selected windows (PDF 1182 kb)
122_2021_3965_MOESM6_ESM.pdf
Fig. S6 A simulation of PCA analysis by different SNP numbers. PCA analysis using 5000 (a), 10,000 (b), 20,000 (c), 30,000 (d), 40,000 (e), 100,000 (f) randomly selected SNPs (PDF 1074 kb)
122_2021_3965_MOESM7_ESM.pdf
Fig. S7 Linkage disequilibrium decay distance of deletions in the population and the different subgroups. a. Linkage disequilibrium decay distance in the population. b. Linkage disequilibrium decay distance in the NSS group. c. Linkage disequilibrium decay distance in the SS group. d. Linkage disequilibrium decay distance in the TS group. The dashed lines on each graph mean the r2 threshold for the LD decay distance (PDF 210 kb)
122_2021_3965_MOESM8_ESM.pdf
Fig. S8 Correlation matrix of deletion frequency and six genomic features (Recombination, Sites of GERP > 2, Sites of GERP > 0, Gene density, Repeat and Centromere distance). Pairwise P value and correlation coefficient are labeled in this graph (PDF 3291 kb)
122_2021_3965_MOESM9_ESM.pdf
Fig. S9 Correlation of expression rank and mean deletion proportion in seven tissues. a. GRoot. b. GShoot. c. Kern. d. L3Base. e. L3Tip. f. LMAD. g. LMAN (PDF 241 kb)
122_2021_3965_MOESM10_ESM.pdf
Fig. S10 The relative distance between eQTL and associated genes analyzed by deletions. a. GRoot. b. GShoot. c. Kern. d. L3Base. e. L3Tip. f. LMAD. g. LMAN (PDF 462 kb)
122_2021_3965_MOESM11_ESM.pdf
Fig. S11 Distribution of eQTL hotspots across the whole genome. The color gradient indicates the number of genes located in each hotspot (PDF 167 kb)
122_2021_3965_MOESM12_ESM.pdf
Fig. S12 Significantly enriched GO of genes with eQTL identified by deletions. a. GRoot. b. GShoot. c. Kern. d. L3Base. e. L3Tip. f. LMAD. g. LMAN (PDF 325 kb)
122_2021_3965_MOESM13_ESM.pdf
Fig. S13 Significantly enriched GO of unique genes with eQTL identified by deletion in each tissue. a. GRoot. b. GShoot. c. Kern. d. L3Base. e. L3Tip. f. LMAD. g. LMAN (PDF 357 kb)
122_2021_3965_MOESM14_ESM.pdf
Fig. S14 The number of significantly associated genes detected in paired relevant tissues. a. GShoot and GRoot. b. LMAN and LMAD. c. L3Base and L3Tip. The pie plots below are the proportion of shared eQTL and unique eQTL detected in paired relevant tissues (PDF 245 kb)
122_2021_3965_MOESM15_ESM.pdf
Fig. S15 The relative distance between eQTL and associated genes analyzed by SNPs. a. GRoot. b. GShoot. c. Kern. d. L3Base. e. L3Tip. f. LMAD. g. LMAN. h. Total (PDF 641 kb)
122_2021_3965_MOESM16_ESM.pdf
Fig. S16 Comparison of effect (R2) between distant eQTL and local eQTL detected by SNPs in different tissues. ** indicates a highly significant difference at P < 0.01 (PDF 1796 kb)
122_2021_3965_MOESM17_ESM.pdf
Fig. S17 Comparison of effect (R2) of top eQTL between deletion (DEL) and SNP for the same genes. a. Distant eQTL. b. Local eQTL. c. Total eQTL (PDF 524 kb)
122_2021_3965_MOESM18_ESM.pdf
Fig. S18 Manhattan plots of GWAS result using deletion alleles. a. DTA. b. DTS. c. EHdivPH. d. GDD_DTA. e. GDD_DTS (PDF 2452 kb)
122_2021_3965_MOESM19_ESM.pdf
Fig. S19 QQ plots of GWAS for five traits using deletion alleles generated in FarmCPU. a. DTA. b. DTS. c. EHdivPH. d. GDD_DTA. e. GDD_DTS (PDF 1044 kb)
122_2021_3965_MOESM20_ESM.pdf
Fig. S20 Manhattan plots of GWAS result using SNP alleles. a. ASI. b. DTA. c. DTS. d. EH. e. EHdivPH. f. GDD_ASI. g. GDD_DTA. h. GDD_DTS. i. PH (PDF 2811 kb)
122_2021_3965_MOESM21_ESM.pdf
Fig. S21 QQ plots of GWAS for nine traits using SNP alleles generated in FarmCPU. a. ASI. b. DTA. c. DTS. d. EH. e. EHdivPH. f. GDD_ASI. g. GDD_DTA. h. GDD_DTS. i. PH (PDF 1659 kb)
122_2021_3965_MOESM22_ESM.pdf
Fig. S22 Comparison of GWAS significant sites between deletion and SNP in flowering time and plant height trait (PDF 16 kb)
122_2021_3965_MOESM23_ESM.xlsx
Table S1 List of pedigree, sequencing depth, mean read length, and the suitable window size of inbred lines in the population (XLSX 28 kb)
122_2021_3965_MOESM30_ESM.xlsx
Table S8 List of genes showing significant expression differences between inbred lines with and without an overlapping deletion. Values in this table are P values; only significant genes across the seven tissues are included. NS means ‘not significant’ (XLSX 68 kb)
122_2021_3965_MOESM37_ESM.xlsx
Table S15 Significant deletion alleles and associated gene information. Left and right boundaries are determined by LD decay (XLSX 13 kb)
122_2021_3965_MOESM38_ESM.xlsx
Table S16 Significant SNP alleles and associated gene information. Left and right boundaries are determined by LD decay (XLSX 14 kb)
About this article
Cite this article
Zhang, X., Zhu, Y., Kremling, K.A.G. et al. Genome-wide analysis of deletions in maize population reveals abundant genetic diversity and functional impact. Theor Appl Genet 135, 273–290 (2022). https://doi.org/10.1007/s00122-021-03965-1
DOI: https://doi.org/10.1007/s00122-021-03965-1