Genome-wide analysis of deletions in maize population reveals abundant genetic diversity and functional impact

  • Original Article
Theoretical and Applied Genetics

Abstract

Key message

Two read-depth methods were jointly applied to next-generation sequencing data to identify deletions in a maize population. Deletion-based GWAS was performed for gene expression patterns and for classical traits.

Abstract

Many studies have confirmed that structural variation (SV) is pervasive throughout the maize genome. Deletion is one type of SV that may affect gene expression and cause phenotypic changes in quantitative traits. In this study, two read-count approaches were used to identify deletions in the whole-genome sequencing data of 270 maize inbred lines. A total of 19,754 deletion windows overlapped 12,751 genes, which were unevenly distributed across the genome. The deletions explained population structure well and correlated with genomic features. The proportion of a gene covered by deletions was negatively correlated with its expression. Mapping of expression quantitative trait loci (eQTL) indicated that local eQTL were fewer but had larger effects than distant ones. The associated genes common to tissues were related to basic metabolic processes, whereas the associated genes unique to a tissue played a role in stress or stimulus responses in multiple tissues. In a comparison with the eQTL detected by SNPs derived from the same sequencing data, 89.4% of the associated genes were detected by both marker types. The effect of the top eQTL detected by SNPs was usually larger than that detected by deletions for the same gene. A genome-wide association study (GWAS) of flowering time and plant height showed that only a few loci were consistently captured by SNPs, suggesting that combining deletions and SNPs in GWAS is an effective strategy for dissecting trait architecture. Our findings provide insights into the characteristics and biological functions of genome-wide deletions in maize.
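As a rough illustration of the read-depth idea summarized above, the following Python sketch flags windows whose read count falls far below a sample's typical coverage and then computes the proportion of a gene covered by flagged windows. It is only a minimal sketch under stated assumptions and is not the pipeline used in the study: the function names, the median-based normalization, and the `ratio_threshold` cutoff are all hypothetical, and windows are assumed to be non-overlapping.

```python
from statistics import median


def call_deletion_windows(read_counts, ratio_threshold=0.1):
    """Flag windows whose read count is far below the sample's median.

    read_counts: per-window read counts for one inbred line.
    ratio_threshold: hypothetical cutoff on count / median (an assumption,
        not a value taken from the article).
    Returns a list of booleans, True where a window looks deleted.
    """
    covered = [c for c in read_counts if c > 0]
    if not covered:
        raise ValueError("no covered windows; cannot normalize")
    baseline = median(covered)  # typical depth among covered windows
    return [c / baseline < ratio_threshold for c in read_counts]


def deletion_proportion(gene_start, gene_end, windows, deleted):
    """Fraction of a gene's length that lies inside deleted windows.

    windows: (start, end) half-open, non-overlapping intervals aligned
        with the `deleted` flags.
    """
    gene_length = gene_end - gene_start
    covered = 0
    for (start, end), is_deleted in zip(windows, deleted):
        if is_deleted:
            overlap = min(end, gene_end) - max(start, gene_start)
            if overlap > 0:
                covered += overlap
    return covered / gene_length


if __name__ == "__main__":
    counts = [100, 98, 0, 3, 105, 102]            # toy per-window counts
    windows = [(i * 1000, (i + 1) * 1000) for i in range(6)]
    flags = call_deletion_windows(counts)
    print(flags)
    print(deletion_proportion(2500, 4500, windows, flags))
```

A per-gene deletion proportion computed this way could then be compared with expression levels, which is the kind of relationship the abstract reports as negatively correlated.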

Data availability

The sequencing data of the 270 inbred lines are available from NCBI SRA under PRJNA389800 and can also be downloaded from the CyVerse Data Store: /iplant/home/shared/panzea/raw_seq_282/bam/ (Bukowski et al. 2017). The VCF files of SNP genotyping data with AGPv3 coordinates can be downloaded from the directory /iplant/home/shared/commons_repo/curated/Qi_Sun_Zea_mays_haplotype_map_2018/282_onHmp321. The RNA-seq data have been deposited in the Sequence Read Archive under accession number SRP115041 and in BioProject under accession number PRJNA383416 (Kremling et al. 2018).

References

  • Albert FW, Kruglyak L (2015) The role of regulatory variation in complex traits and disease. Nat Rev Genet 16:197

  • Alonge M, Wang X, Benoit M, Soyk S, Pereira L, Zhang L, Suresh H, Ramakrishnan S, Maumus F, Ciren D, Levy Y, Harel TH, Shalev-Schlosser G, Amsellem Z, Razifard H, Caicedo AL, Tieman DM, Klee H, Kirsche M, Aganezov S, Ranallo-Benavidez TR, Lemmon ZH, Kim J, Robitaille G, Kramer M, Goodwin S, McCombie WR, Hutton S, Van Eck J, Gillis J, Eshed Y, Sedlazeck FJ, van der Knaap E, Schatz MC, Lippman ZB (2020) Major impacts of widespread structural variation on gene expression and crop improvement in tomato. Cell 182:145-161.e123

  • Bradbury PJ, Zhang Z, Kroon DE, Casstevens TM, Ramdoss Y, Buckler ES (2007) TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics 23:2633–2635

  • Buckler ES, Holland JB, Bradbury PJ, Acharya CB, Brown PJ, Browne C, Ersoz E, Flint-Garcia S, Garcia A, Glaubitz JC, Goodman MM, Harjes C, Guill K, Kroon DE, Larsson S, Lepak NK, Li H, Mitchell SE, Pressoir G, Peiffer JA, Rosas MO, Rocheford TR, Romay MC, Romero S, Salvo S, Villeda HS, Sofia da Silva H, Sun Q, Tian F, Upadyayula N, Ware D, Yates H, Yu J, Zhang Z, Kresovich S, McMullen MD (2009) The genetic architecture of maize flowering time. Science 325:714

  • Bukowski R, Guo X, Lu Y, Zou C, He B, Rong Z, Wang B, Xu D, Yang B, Xie C, Fan L, Gao S, Xu X, Zhang G, Li Y, Jiao Y, Doebley JF, Ross-Ibarra J, Lorant A, Buffalo V, Romay MC, Buckler ES, Ware D, Lai J, Sun Q, Xu Y (2017) Construction of the third-generation Zea mays haplotype map. GigaScience 7

  • Castelletti S, Tuberosa R, Pindo M, Salvi S (2014) A MITE transposon insertion is associated with differential methylation at the maize flowering time QTL Vgt1. G3 Genes Genom Genet 4:805–812

  • Chiang C, Scott AJ, Davis JR, Tsang EK, Li X, Kim Y, Hadzic T, Damani FN, Ganel L, Montgomery SB, Battle A, Conrad DF, Hall IM, Consortium GT (2017) The impact of structural variation on human gene expression. Nat Genet 49:692–699

  • Consortium GP (2012) An integrated map of genetic variation from 1,092 human genomes. Nature 491:56

  • Cook DE, Lee TG, Guo X, Melito S, Wang K, Bayless AM, Wang J, Hughes TJ, Willis DK, Clemente TE, Diers BW, Jiang J, Hudson ME, Bent AF (2012) Copy number variation of multiple genes at Rhg1 mediates nematode resistance in soybean. Science 338:1206

  • Cremer T, Cremer M, Dietzel S, Müller S, Solovei I, Fakan S (2006) Chromosome territories—A functional nuclear landscape. Curr Opin Cell Biol 18:307–316

  • Della Coletta R, Qiu Y, Ou S, Hufford MB, Hirsch CN (2021) How the pan-genome is changing crop genomics and improvement. Genome Biol 22:3

  • Díaz A, Zikhali M, Turner AS, Isaac P, Laurie DA (2012) Copy number variation affecting the photoperiod-B1 and vernalization-A1 genes is associated with altered flowering time in wheat (Triticum aestivum). PLoS ONE 7:e33234

  • Dolatabadian A, Patel DA, Edwards D, Batley J (2017) Copy number variation and disease resistance in plants. Theor Appl Genet 130:2479–2490

  • Fan K-H, Devos KM, Schliekelman P (2020) Strategies for eQTL mapping in allopolyploid organisms. Theor Appl Genet 133:2477–2497

  • Feuk L, Carson AR, Scherer SW (2006) Structural variation in the human genome. Nat Rev Genet 7:85–97

  • Flint-Garcia SA, Thuillet A-C, Yu J, Pressoir G, Romero SM, Mitchell SE, Doebley J, Kresovich S, Goodman MM, Buckler ES (2005) Maize association population: a high-resolution platform for quantitative trait locus dissection. Plant J 44:1054–1064

  • Fraser P, Bickmore W (2007) Nuclear organization of the genome and the potential for gene regulation. Nature 447:413

  • Fu J, Cheng Y, Linghu J, Yang X, Kang L, Zhang Z, Zhang J, He C, Du X, Peng Z, Wang B, Zhai L, Dai C, Xu J, Wang W, Li X, Zheng J, Chen L, Luo L, Liu J, Qian X, Yan J, Wang J, Wang G (2013) RNA sequencing reveals the complex regulatory network in the maize kernel. Nat Commun 4:2832

  • Gabur I, Chawla HS, Snowdon RJ, Parkin IAP (2019) Connecting genome structural variation with complex traits in crop plants. Theor Appl Genet 132:733–750

  • Gamazon ER, Stranger BE (2015) The impact of human copy number variation on gene expression. Brief Funct Genomics 14:352–357

  • Golicz AA, Batley J, Edwards D (2016) Towards plant pangenomics. Plant Biotechnol J 14:1099–1105

  • Gonzalez E, Kulkarni H, Bolivar H, Mangano A, Sanchez R, Catano G, Nibbs RJ, Freedman BI, Quinones MP, Bamshad MJ (2005) The influence of CCL3L1 gene-containing segmental duplications on HIV-1/AIDS susceptibility. Science 307:1434

  • Gore MA, Chia J-M, Elshire RJ, Sun Q, Ersoz ES, Hurwitz BL, Peiffer JA, McMullen MD, Grills GS, Ross-Ibarra J, Ware DH, Buckler ES (2009) A first-generation haplotype map of maize. Science 326:1115–1117

  • Ha G, Roth A, Lai D, Bashashati A, Ding J, Goya R, Giuliany R, Rosner J, Oloumi A, Shumansky K (2012) Integrative analysis of genome-wide loss of heterozygosity and monoallelic expression at nucleotide resolution reveals disrupted pathways in triple-negative breast cancer. Genome Res 22:1995–2007

  • Handsaker RE, Van Doren V, Berman JR, Genovese G, Kashin S, Boettger LM, McCarroll SA (2015) Large multiallelic copy number variations in humans. Nat Genet 47:296

  • Hansen BG, Halkier BA, Kliebenstein DJ (2008) Identifying the molecular basis of QTLs: eQTLs add a new dimension. Trends Plant Sci 13:72–77

  • Helbig I, Mefford HC, Sharp AJ, Guipponi M, Fichera M, Franke A, Muhle H, de Kovel C, Baker C, von Spiczak S, Kron KL, Steinich I, Kleefusz-Lie AA, Leu C, Gaus V, Schmitz B, Klein KM, Reif PS, Rosenow F, Weber Y, Lerche H, Zimprich F, Urak L, Fuchs K, Feucht M, Genton P, Thomas P, Visscher F, de Haan G-J, Moller RS, Hjalgrim H, Luciano D, Wittig M, Nothnagel M, Elger CE, Nurnberg P, Romano C, Malafosse A, Koeleman BPC, Lindhout D, Stephani U, Schreiber S, Eichler EE, Sander T (2009) 15q13.3 microdeletions increase risk of idiopathic generalized epilepsy. Nat Genet 41:160–162

  • Holloway B, Luck S, Beatty M, Rafalski JA, Li B (2011) Genome-wide expression quantitative trait loci (eQTL) analysis in maize. BMC Genom 12:336

  • Huang C, Sun H, Xu D, Chen Q, Liang Y, Wang X, Xu G, Tian J, Wang C, Li D, Wu L, Yang X, Jin W, Doebley JF, Tian F (2018) ZmCCT9 enhances maize adaptation to higher latitudes. Proc Natl Acad Sci 115:E334

  • Hufford MB, Seetharam AS, Woodhouse MR, Chougule KM, Ou S, Liu J, Ricci WA, Guo T, Olson A, Qiu Y, Della Coletta R, Tittes S, Hudson AI, Marand AP, Wei S, Lu Z, Wang B, Tello-Ruiz MK, Piri RD, Wang N, Dw K, Zeng Y, O’Connor CH, Li X, Gilbert AM, Baggs E, Krasileva KV, Portwood JL, Cannon EKS, Andorf CM, Manchanda N, Snodgrass SJ, Hufnagel DE, Jiang Q, Pedersen S, Syring ML, Kudrna DA, Llaca V, Fengler K, Schmitz RJ, Ross-Ibarra J, Yu J, Gent JI, Hirsch CN, Ware D, Dawe RK (2021) De novo assembly, annotation, and comparative analysis of 26 diverse maize genomes. Science 373:655

  • Institute B (2019) Picard Toolkit. GitHub Repository, http://broadinstitute.github.io/picard/

  • Jin J, Tian F, Yang D-C, Meng Y-Q, Kong L, Luo J, Gao G (2016) PlantTFDB 4.0: toward a central hub for transcription factors and regulatory interactions in plants. Nucleic Acids Res 45:D1040–D1045

  • Kim S, Misra A (2007) SNP genotyping: technologies and biomedical applications. Annu Rev Biomed Eng 9:289–320

  • Knouse KA, Wu J, Amon A (2016) Assessment of megabase-scale somatic copy number variation using single-cell sequencing. Genome Res 26:376

  • Kremling KAG, Chen S-Y, Su M-H, Lepak NK, Romay MC, Swarts KL, Lu F, Lorant A, Bradbury PJ, Buckler ES (2018) Dysregulation of expression correlates with rare-allele burden and fitness loss in maize. Nature 555:520

  • Kremling KAG, Diepenbrock CH, Gore MA, Buckler ES, Bandillo NB (2019) Transcriptome-wide association supplements genome-wide association in Zea mays. G3 Genes Genom Genet 9:3023–3033

  • Lai J, Li R, Xu X, Jin W, Xu M, Zhao H, Xiang Z, Song W, Ying K, Zhang M, Jiao Y, Ni P, Zhang J, Li D, Guo X, Ye K, Jian M, Wang B, Zheng H, Liang H, Zhang X, Wang S, Chen S, Li J, Fu Y, Springer NM, Yang H, Wang J, Dai J, Schnable PS, Wang J (2010) Genome-wide patterns of genetic variation among elite maize inbred lines. Nat Genet 42:1027–1030

  • Li H, Durbin R (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760

  • Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R (2009) The Sequence alignment/map format and SAMtools. Bioinformatics 25:2078–2079

  • Li Y-x, Li C, Bradbury PJ, Liu X, Lu F, Romay CM, Glaubitz JC, Wu X, Peng B, Shi Y, Song Y, Zhang D, Buckler ES, Zhang Z, Li Y, Wang T (2016) Identification of genetic variants associated with maize flowering time using an extremely large multi-genetic background population. Plant J 86:391–402

  • Lin G, He C, Zheng J, Koo D-H, Le H, Zheng H, Tamang TM, Lin J, Liu Y, Zhao M, Hao Y, McFraland F, Wang B, Qin Y, Tang H, McCarty DR, Wei H, Cho M-J, Park S, Kaeppler H, Kaeppler SM, Liu Y, Springer N, Schnable PS, Wang G, White FF, Liu S (2021) Chromosome-level genome assembly of a regenerable maize inbred line A188. Genome Biol 22:175

  • Liu H, Huang Y, Li X, Wang H, Ding Y, Kang C, Sun M, Li F, Wang J, Deng Y, Yang X, Huang X, Gao X, Yuan L, An D, Wang W, Holding DR, Wu Y (2019) High frequency DNA rearrangement at qγ27 creates a novel allele for Quality Protein Maize breeding. Commun Biol 2:460

  • Liu H, Luo X, Niu L, Xiao Y, Chen L, Liu J, Wang X, Jin M, Li W, Zhang Q, Yan J (2017a) Distant eQTLs and non-coding sequences play critical roles in regulating gene expression and quantitative trait variation in maize. Mol Plant 10:414–426

  • Liu L, Du Y, Shen X, Li M, Sun W, Huang J, Liu Z, Tao Y, Zheng Y, Yan J, Zhang Z (2015) KRN4 controls quantitative variation in maize kernel row number. PLOS Genet 11:e1005670

  • Liu Q, Liu H, Gong Y, Tao Y, Jiang L, Zuo W, Yang Q, Ye J, Lai J, Wu J, Lübberstedt T, Xu M (2017b) An atypical thioredoxin imparts early resistance to sugarcane mosaic virus in maize. Mol Plant 10:483–497

  • Liu S, Li C, Wang H, Wang S, Yang S, Liu X, Yan J, Li B, Beatty M, Zastrow-Hayes G, Song S, Qin F (2020a) Mapping regulatory variants controlling gene expression in drought response and tolerance in maize. Genome Biol 21:163

  • Liu Y, Du H, Li P, Shen Y, Peng H, Liu S, Zhou G-A, Zhang H, Liu Z, Shi M, Huang X, Li Y, Zhang M, Wang Z, Zhu B, Han B, Liang C, Tian Z (2020b) Pan-genome of wild and cultivated soybeans. Cell 182:162-176.e113

  • Lu F, Romay MC, Glaubitz JC, Bradbury PJ, Elshire RJ, Wang T, Li Y, Li Y, Semagn K, Zhang X, Hernandez AG, Mikel MA, Soifer I, Barad O, Buckler ES (2015) High-resolution genetic mapping of maize pan-genome sequence anchors. Nat Commun 6:6914

  • Lu Y, Shah T, Hao Z, Taba S, Zhang S, Gao S, Liu J, Cao M, Wang J, Prakash AB, Rong T, Xu Y (2011) Comparative SNP and haplotype analysis reveals a higher genetic diversity and rapider LD decay in tropical than temperate germplasm in maize. PLoS ONE 6:e24861

  • Lye ZN, Purugganan MD (2019) Copy number variation in domestication. Trends Plant Sci 24:352–365

  • Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, McCarthy MI, Ramos EM, Cardon LR, Chakravarti A, Cho JH, Guttmacher AE, Kong A, Kruglyak L, Mardis E, Rotimi CN, Slatkin M, Valle D, Whittemore AS, Boehnke M, Clark AG, Eichler EE, Gibson G, Haines JL, Mackay TFC, McCarroll SA, Visscher PM (2009) Finding the missing heritability of complex diseases. Nature 461:747–753

  • Maron LG, Guimarães CT, Kirst M, Albert PS, Birchler JA, Bradbury PJ, Buckler ES, Coluccio AE, Danilova TV, Kudrna D (2013) Aluminum tolerance in maize is associated with higher MATE1 gene copy number. Proc Natl Acad Sci USA 110:5241

  • McConnell MJ, Lindberg MR, Brennand KJ, Piper JC, Voet T, Cowing-Zitron C, Shumilina S, Lasken RS, Vermeesch JR, Hall IM, Gage FH (2013) Mosaic copy number variation in human neurons. Science 342:632

  • Meng X, Muszynski MG, Danilevskaya ON (2011) The FT-Like ZCN8 gene functions as a floral activator and is involved in photoperiod sensitivity in maize. Plant Cell 23:942

  • Michael TP, VanBuren R (2020) Building near-complete plant genomes. Curr Opin Plant Biol 54:26–33

  • Miculan M, Nelissen H, Ben Hassen M, Marroni F, Inzé D, Pè ME, Dell’Acqua M (2021) A forward genetics approach integrating genome-wide association study and expression quantitative trait locus mapping to dissect leaf development in maize (Zea mays). Plant J 107:1056–1071

  • Mills RE, Walter K, Stewart C, Handsaker RE, Chen K, Alkan C, Abyzov A, Yoon SC, Ye K, Cheetham RK (2011) Mapping copy number variation by population-scale genome sequencing. Nature 470:59–65

  • Nuytemans K, Meeus B, Crosiers D, Brouwers N, Goossens D, Engelborghs S, Pals P, Pickut B, Van DBM, Corsmit E (2009) Relative contribution of simple mutations vs. copy number variations in five Parkinson disease genes in the Belgian population. Human Mutation 30:1054–1061

  • Pang J, Fu J, Zong N, Wang J, Song D, Zhang X, He C, Fang T, Zhang H, Fan Y, Wang G, Zhao J (2019) Kernel size-related genes revealed by an integrated eQTL analysis during early maize kernel development. Plant J 98:19–32

  • Paschold A, Jia Y, Marcon C, Lund S, Larson NB, Yeh C-T, Ossowski S, Lanz C, Nettleton D, Schnable PS (2012) Complementation contributes to transcriptome complexity in maize (Zea mays L.) hybrids relative to their inbred parents. Genome Res 22:2445–2454

  • Peiffer JA, Romay MC, Gore MA, Flint-Garcia SA, Zhang Z, Millard MJ, Gardner CAC, McMullen MD, Holland JB, Bradbury PJ (2014) The genetic architecture of maize height. Genetics 196:1337

  • Prasad A, Merico D, Thiruvahindrapuram B, Wei J, Lionel AC, Sato D, Rickaby J, Lu C, Szatmari P, Roberts W, Fernandez BA, Marshall CR, Hatchwell E, Eis PS, Scherer SW (2012) A discovery resource of rare copy number variations in individuals with autism spectrum disorder. G3 Genes Genom Genet 2:1665–1685

  • Qian L, Voss-Fels K, Cui Y, Jan HU, Samans B, Obermeier C, Qian W, Snowdon RJ (2016) Deletion of a stay-green gene associates with adaptive selection in Brassica napus. Mol Plant 9:1559–1569

  • Qin P, Lu H, Du H, Wang H, Chen W, Chen Z, He Q, Ou S, Zhang H, Li X, Li X, Li Y, Liao Y, Gao Q, Tu B, Yuan H, Ma B, Wang Y, Qian Y, Fan S, Li W, Wang J, He M, Yin J, Li T, Jiang N, Chen X, Liang C, Li S (2021) Pan-genome analysis of 33 genetically diverse rice accessions reveals hidden genomic variations. Cell 184:3542-3558.e3516

  • Quinlan AR, Hall IM (2010) BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26:841–842

  • Reymond A, Henrichsen CN, Harewood L, Merla G (2007) Side effects of genome structural changes. Curr Opin Genet Dev 17:381–386

  • Rodgers-Melnick E, Bradbury PJ, Elshire RJ, Glaubitz JC, Acharya CB, Mitchell SE, Li C, Li Y, Buckler ES (2015) Recombination in diverse maize is stable, predictable, and associated with genetic load. Proc Natl Acad Sci USA 112:3823–3828

  • Romero Navarro JA, Willcox M, Burgueño J, Romay C, Swarts K, Trachsel S, Preciado E, Terron A, Delgado HV, Vidal V, Ortega A, Banda AE, Montiel NOG, Ortiz-Monasterio I, Vicente FS, Espinoza AG, Atlin G, Wenzl P, Hearne S, Buckler ES (2017) A study of allelic diversity underlying flowering-time adaptation in maize landraces. Nat Genet 49:476–480

  • Salvi S, Sponza G, Morgante M, Tomes D, Niu X, Fengler KA, Meeley R, Ananiev EV, Svitashev S, Bruggemann E, Li B, Hainey CF, Radovic S, Zaina G, Rafalski JA, Tingey SV, Miao G-H, Phillips RL, Tuberosa R (2007) Conserved noncoding genomic sequences associated with a flowering-time quantitative trait locus in maize. Proc Natl Acad Sci 104:11376

  • Schnable PS, Ware D, Fulton RS, Stein JC, Wei F, Pasternak S, Liang C, Zhang J, Fulton L, Graves TA, Minx P, Reily AD, Courtney L, Kruchowski SS, Tomlinson C, Strong C, Delehaunty K, Fronick C, Courtney B, Rock SM, Belter E, Du F, Kim K, Abbott RM, Cotton M, Levy A, Marchetto P, Ochoa K, Jackson SM, Gillam B, Chen W, Yan L, Higginbotham J, Cardenas M, Waligorski J, Applebaum E, Phelps L, Falcone J, Kanchi K, Thane T, Scimone A, Thane N, Henke J, Wang T, Ruppert J, Shah N, Rotter K, Hodges J, Ingenthron E, Cordes M, Kohlberg S, Sgro J, Delgado B, Mead K, Chinwalla A, Leonard S, Crouse K, Collura K, Kudrna D, Currie J, He R, Angelova A, Rajasekar S, Mueller T, Lomeli R, Scara G, Ko A, Delaney K, Wissotski M, Lopez G, Campos D, Braidotti M, Ashley E, Golser W, Kim H, Lee S, Lin J, Dujmic Z, Kim W, Talag J, Zuccolo A, Fan C, Sebastian A, Kramer M, Spiegel L, Nascimento L, Zutavern T, Miller B, Ambroise C, Muller S, Spooner W, Narechania A, Ren L, Wei S, Kumari S, Faga B, Levy MJ, McMahan L, Van Buren P, Vaughn MW, Ying K, Yeh C-T, Emrich SJ, Jia Y, Kalyanaraman A, Hsia A-P, Barbazuk WB, Baucom RS, Brutnell TP, Carpita NC, Chaparro C, Chia J-M, Deragon J-M, Estill JC, Fu Y, Jeddeloh JA, Han Y, Lee H, Li P, Lisch DR, Liu S, Liu Z, Nagel DH, McCann MC, SanMiguel P, Myers AM, Nettleton D, Nguyen J, Penning BW, Ponnala L, Schneider KL, Schwartz DC, Sharma A, Soderlund C, Springer NM, Sun Q, Wang H, Waterman M, Westerman R, Wolfgruber TK, Yang L, Yu Y, Zhang L, Zhou S, Zhu Q, Bennetzen JL, Dawe RK, Jiang J, Jiang N, Presting GG, Wessler SR, Aluru S, Martienssen RA, Clifton SW, McCombie WR, Wing RA, Wilson RK (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326:1112–1115

  • Schneider KL, Xie Z, Wolfgruber TK, Presting GG (2016) Inbreeding drives maize centromere evolution. Proc Natl Acad Sci 113:E987

  • Sedlazeck FJ, Rescheneder P, Smolka M, Fang H, Nattestad M, von Haeseler A, Schatz MC (2018) Accurate detection of complex structural variations using single-molecule sequencing. Nat Methods 15:461–468

  • Springer NM, Ying K, Fu Y, Ji T, Yeh CT, Jia Y, Wu W, Richmond T, Kitzman J, Rosenbaum H (2009) Maize inbreds exhibit high levels of copy number variation (CNV) and presence/absence variation (PAV) in genome content. PLOS Genet 5:e1000734

  • Stegle O, Parts L, Durbin R, Winn J (2010) A Bayesian framework to account for complex non-genetic factors in gene expression levels greatly increases power in eQTL studies. PLOS Comput Biol 6:e1000770

  • Studer A, Zhao Q, Ross-Ibarra J, Doebley J (2011) Identification of a functional transposon insertion in the maize domestication gene tb1. Nat Genet 43:1160

  • Sudmant PH, Rausch T, Gardner EJ, Handsaker RE, Abyzov A, Huddleston J, Zhang Y, Ye K, Jun G, Fritz HY (2015) An integrated map of structural variation in 2,504 human genomes. Nature 526:75

  • Sun S, Zhou Y, Chen J, Shi J, Zhao H, Zhao H, Song W, Zhang M, Cui Y, Dong X, Liu H, Ma X, Jiao Y, Wang B, Wei X, Stein JC, Glaubitz JC, Lu F, Yu G, Liang C, Fengler K, Li B, Rafalski A, Schnable PS, Ware DH, Buckler ES, Lai J (2018) Extensive intraspecific gene order and gene structural variations between Mo17 and other maize genomes. Nat Genet 50:1289–1295

  • Swanson-Wagner RA, Eichten SR, Kumari S, Tiffin P, Stein JC, Ware D, Springer NM (2010) Pervasive gene content variation and copy number variation in maize and its undomesticated progenitor. Genome Res 20:1689

  • Tian T, Liu Y, Yan H, You Q, Yi X, Du Z, Xu W, Su Z (2017) agriGO v2.0: a GO analysis toolkit for the agricultural community, 2017 update. Nucleic Acids Res 45

  • Tranchant‐Dubreuil C, Rouard M, Sabot F (2018) Plant pangenome: impacts on phenotypes and evolution. Annual Plant Reviews online:1–25

  • Wang X, Chen Q, Wu Y, Lemmon ZH, Xu G, Huang C, Liang Y, Xu D, Li D, Doebley JF, Tian F (2018) Genome-wide analysis of transcriptional variability in a large maize-teosinte population. Mol Plant 11:443–459

  • Wolfgruber TK, Sharma A, Schneider KL, Albert PS, Koo D-H, Shi J, Gao Z, Han F, Lee H, Xu R, Allison J, Birchler JA, Jiang J, Dawe RK, Presting GG (2009) Maize centromere structure and evolution: sequence analysis of centromeres 2 and 5 reveals dynamic loci shaped primarily by retrotransposons. PLOS Genet 5:e1000743

  • Xie C, Mao X, Huang J, Ding Y, Wu J, Dong S, Kong L, Gao G, Li C-Y, Wei L (2011) KOBAS 2.0: a web server for annotation and identification of enriched pathways and diseases. Nucleic Acids Res 39:W316–W322

  • Yan J, Shah T, Warburton ML, Buckler ES, McMullen MD, Crouch J (2009) Genetic characterization and linkage disequilibrium estimation of a global maize collection using SNP markers. PLoS ONE 4:e8451

  • Yang L, Liu B, Huang B, Deng J, Li H, Yu B, Qiu F, Cheng M, Wang H, Yang R, Yang X, Zhou Y, Lu J (2013a) A functional copy number variation in the WWOX gene is associated with lung cancer risk in Chinese. Hum Mol Genet 22:1886–1894

  • Yang N, Liu J, Gao Q, Gui S, Chen L, Yang L, Huang J, Deng T, Luo J, He L, Wang Y, Xu P, Peng Y, Shi Z, Lan L, Ma Z, Yang X, Zhang Q, Bai M, Li S, Li W, Liu L, Jackson D, Yan J (2019) Genome assembly of a tropical maize inbred line provides insights into structural variation and crop improvement. Nat Genet 51:1052–1059

  • Yang Q, Li Z, Li W, Ku L, Wang C, Ye J, Li K, Yang N, Li Y, Zhong T, Li J, Chen Y, Yan J, Yang X, Xu M (2013b) CACTA-like transposable element in ZmCCT attenuated photoperiod sensitivity and accelerated the postdomestication spread of maize. Proc Natl Acad Sci 110:16969

  • Yuan Y, Bayer PE, Batley J, Edwards D (2021) Current status of structural variation studies in plants. Plant Biotechnol J. https://doi.org/10.1111/pbi.13646

  • Zhang X, Zhang H, Li L, Lan H, Ren Z, Liu D, Wu L, Liu H, Jaqueth J, Li B, Pan G, Gao S (2016) Characterizing the population structure and genetic diversity of maize breeding germplasm in Southwest China using genome-wide SNP markers. BMC Genom 17:697

  • Zmienko A, Marszalek-Zenczak M, Wojciechowski P, Samelak-Czajka A, Luczak M, Kozlowski P, Karlowski WM, Figlerowicz M (2020) AthCNV: a map of DNA copy number variations in the Arabidopsis genome. Plant Cell 32:1797–1819

  • Zuo W, Chao Q, Zhang N, Ye J, Tan G, Li B, Xing Y, Zhang B, Liu H, Fengler KA, Zhao J, Zhao X, Chen Y, Lai J, Yan J, Xu M (2014) A maize wall-associated kinase confers quantitative resistance to head smut. Nat Genet 47:151

Acknowledgements

XZ thanks Sara Miller, Xiaolei Liu, Dong Zhang, and Tao Zuo for their help while working in the Buckler lab and gratefully acknowledges the China Scholarship Council (CSC) for financial support during study in the USA. The authors also thank Dan Liu, Ling Wu, and Bowen Luo for help with data analysis, and Duojiang Gao and Shiqiang Gao for help with data collection at Sichuan Agricultural University of China.

Funding

This work was supported by Sichuan Science and Technology Support Project 2021YFYZ0027, 2021YFYZ0020, and 2021YFFZ0017, China Agriculture Research System of MOF and MARA, the National Natural Science Foundation of China (31971955), and the US Department of Agriculture–Agricultural Research Service and the National Science Foundation grant IOS-1238014 to E.S.B.

Author information

Contributions

XZ performed bioinformatics analysis and wrote the manuscript. YZ performed bioinformatics analysis and reviewed the manuscript. KK participated in large-scale sample collection, RNA-seq and expression data production, and revised the manuscript. MCR managed the field work, collected tissues for next-generation sequencing of the population, and reviewed the manuscript. RB and QS performed raw sequencing data processing and data management. SG, ESB, and FL helped in manuscript discussion and writing. FL and XZ conceived the project. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Xiao Zhang or Fei Lu.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Communicated by Thomas Lubberstedt.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

122_2021_3965_MOESM1_ESM.pdf

Fig. S1 Correlation between read depth (q > 30) and average read count, average copy number per window. a. Correlation between read depth (q > 30) and average read count per window in HMMCopy method. b. Correlation between read depth (q > 30) and average read count per window in dynamic window method. c. Correlation between read depth (q > 30) and average copy number per window in HMMCopy method. d. Correlation between read depth (q > 30) and average copy number per window in dynamic window method (PDF 455 kb)

Fig. S2 Frequency of BUSCOs overlapped and non-overlapped deletions (PDF 12 kb)

122_2021_3965_MOESM3_ESM.pdf

Fig. S3 Deletion allele frequency and inbred line frequency for deletion windows in this population. a. Deletion allele frequency distribution of inbred lines. b. Inbred line frequency for deletion windows (PDF 39 kb)

122_2021_3965_MOESM4_ESM.pdf

Fig. S4 Comparison of polymorphism SNP and deletion allele frequency derived from B73. a. Proportion of the polymorphism SNPs and deletions according to different allele frequencies derived from B73. b. Correlation between allele frequency of the polymorphism deletion and mean SNP frequency derived from B73 within the deletion window (PDF 1890 kb)

122_2021_3965_MOESM5_ESM.pdf

Fig. S5 A simulation of PCA analysis by different deletion window numbers. PCA analysis using 5000 (a), 10,000 (b), 20,000 (c), 30,000 (d), 40,000 (e), 100,000 (f) randomly selected windows (PDF 1182 kb)

122_2021_3965_MOESM6_ESM.pdf

Fig. S6 A simulation of PCA analysis by different SNP numbers. PCA analysis using 5000 (a), 10,000 (b), 20,000 (c), 30,000 (d), 40,000 (e), 100,000 (f) randomly selected SNPs (PDF 1074 kb)

122_2021_3965_MOESM7_ESM.pdf

Fig. S7 Linkage disequilibrium decay distance of deletions in the population and the different subgroups. a. Linkage disequilibrium decay distance in the population. b. Linkage disequilibrium decay distance in the NSS group. c. Linkage disequilibrium decay distance in the SS group. d. Linkage disequilibrium decay distance in the TS group. The dashed line on each graph marks the r² threshold used to determine the LD decay distance (PDF 210 kb)

122_2021_3965_MOESM8_ESM.pdf

Fig. S8 Correlation matrix of deletion frequency and six genomic features (Recombination, Sites of GERP > 2, Sites of GERP > 0, Gene density, Repeat and Centromere distance). Pairwise P value and correlation coefficient are labeled in this graph (PDF 3291 kb)

122_2021_3965_MOESM9_ESM.pdf

Fig. S9 Correlation of expression rank and mean deletion proportion in seven tissues. a. GRoot. b. GShoot. c. Kern. d. L3Base. e. L3Tip. f. LMAD. g. LMAN (PDF 241 kb)

122_2021_3965_MOESM10_ESM.pdf

Fig. S10 The relative distance between eQTL and associated genes analyzed by deletions. a. GRoot. b. GShoot. c. Kern. d. L3Base. e. L3Tip. f. LMAD. g. LMAN (PDF 462 kb)

122_2021_3965_MOESM11_ESM.pdf

Fig. S11 Distribution of eQTL hotspots across the whole genome. The color gradient indicates the number of genes located in each hotspot (PDF 167 kb)

122_2021_3965_MOESM12_ESM.pdf

Fig. S12 Significantly enriched GO of genes with eQTL identified by deletions. a. GRoot. b. GShoot. c. Kern. d. L3Base. e. L3Tip. f. LMAD. g. LMAN (PDF 325 kb)

122_2021_3965_MOESM13_ESM.pdf

Fig. S13 Significantly enriched GO of unique genes with eQTL identified by deletion in each tissue. a. GRoot. b. GShoot. c. Kern. d. L3Base. e. L3Tip. f. LMAD. g. LMAN (PDF 357 kb)

122_2021_3965_MOESM14_ESM.pdf

Fig. S14 The number of significantly associated genes detected in paired relevant tissues. a. GShoot and GRoot. b. LMAN and LMAD. c. L3Base and L3Tip. The pie plots below are the proportion of shared eQTL and unique eQTL detected in paired relevant tissues (PDF 245 kb)

122_2021_3965_MOESM15_ESM.pdf

Fig. S15 The relative distance between eQTL and associated genes analyzed by SNPs. a. GRoot. b. GShoot. c. Kern. d. L3Base. e. L3Tip. f. LMAD. g. LMAN. h. Total (PDF 641 kb)

122_2021_3965_MOESM16_ESM.pdf

Fig. S16 Comparison of effect (R²) between distant eQTL and local eQTL detected by SNPs in different tissues. ** indicates a highly significant difference at P < 0.01 (PDF 1796 kb)

122_2021_3965_MOESM17_ESM.pdf

Fig. S17 Comparison of effect (R²) of top eQTL between deletion (DEL) and SNP for the same genes. a. Distant eQTL. b. Local eQTL. c. Total eQTL (PDF 524 kb)

122_2021_3965_MOESM18_ESM.pdf

Fig. S18 Manhattan plots of GWAS result using deletion alleles. a. DTA. b. DTS. c. EHdivPH. d. GDD_DTA. e. GDD_DTS (PDF 2452 kb)

122_2021_3965_MOESM19_ESM.pdf

Fig. S19 QQ plots of GWAS for five traits using deletion alleles generated in FarmCPU. a. DTA. b. DTS. c. EHdivPH. d. GDD_DTA. e. GDD_DTS (PDF 1044 kb)

122_2021_3965_MOESM20_ESM.pdf

Fig. S20 Manhattan plots of GWAS result using SNP alleles. a. ASI. b. DTA. c. DTS. d. EH. e. EHdivPH. f. GDD_ASI. g. GDD_DTA. h. GDD_DTS. i. PH (PDF 2811 kb)

122_2021_3965_MOESM21_ESM.pdf

Fig. S21 QQ plots of GWAS for nine traits using SNP alleles generated in FarmCPU. a. ASI. b. DTA. c. DTS. d. EH. e. EHdivPH. f. GDD_ASI. g. GDD_DTA. h. GDD_DTS. i. PH (PDF 1659 kb)

122_2021_3965_MOESM22_ESM.pdf

Fig. S22 Comparison of GWAS significant sites between deletion and SNP in flowering time and plant height trait (PDF 16 kb)

122_2021_3965_MOESM23_ESM.xlsx

Table S1 List of pedigree, sequencing depth, mean read length, and the suitable window size of inbred lines in the population (XLSX 28 kb)

Table S2 PAV length of CML247, Mo17, and W22 relative to B73 (DOCX 16 kb)

Table S3 PAV length of B73 relative to CML247, Mo17, and W22 (DOCX 15 kb)

Table S4 List of deletion windows enriched in more than 95% of inbred lines in this population (XLSX 10 kb)

Table S5 List of deletion regions that included more than 20 continuous deletion windows (XLSX 11 kb)

Table S6 Multiple comparison of deletion frequency in different groups (DOCX 15 kb)

Table S7 Descriptive statistics of deletion frequency for different groups (DOCX 15 kb)

122_2021_3965_MOESM30_ESM.xlsx

Table S8 List of genes showing significant expression differences between inbred lines with and without an overlapping deletion. Data in this table are the P values; only genes significant in the seven tissues are included. NS means ‘not significant’ (XLSX 68 kb)

Table S9 Summary of eQTL detected by deletions in seven tissues (DOCX 16 kb)

Table S10 List of deletion windows enriched with more than ten eQTL in seven tissues (XLSX 12 kb)

Table S11 KEGG pathway enrichment analysis of genes extracted from distant eQTL hotspots (XLSX 11 kb)

Table S12 List of 301 genes consistently associated with eQTL in all seven tissues (XLSX 20 kb)

Table S13 Summary of eQTL detected by SNPs in seven tissues (DOCX 16 kb)

Table S14 List of unique eQTL consistently detected by deletions and SNPs inside the deletion window (XLSX 14 kb)

122_2021_3965_MOESM37_ESM.xlsx

Table S15 Significant deletion alleles and associated gene information. Left and right boundaries are determined by LD decay (XLSX 13 kb)

122_2021_3965_MOESM38_ESM.xlsx

Table S16 Significant SNP alleles and associated gene information. Left and right boundaries are determined by LD decay (XLSX 14 kb)

About this article

Cite this article

Zhang, X., Zhu, Y., Kremling, K.A.G. et al. Genome-wide analysis of deletions in maize population reveals abundant genetic diversity and functional impact. Theor Appl Genet 135, 273–290 (2022). https://doi.org/10.1007/s00122-021-03965-1

  • DOI: https://doi.org/10.1007/s00122-021-03965-1
