Resequencing 93 accessions of coffee unveils independent and parallel selection during Coffea species divergence

Plant Molecular Biology

Abstract

Coffea arabica, C. canephora and C. excelsa, which differ in morphological traits and grow under distinct agro-climatic conditions, account for the majority of global coffee plantations. A comprehensive understanding of their genetic diversity and divergence, which is needed for future genetic improvement, requires high-density markers. Here, we sequenced 93 accessions encompassing these three Coffea species and uncovered 15,367,960 single-nucleotide polymorphisms (SNPs). These SNPs are unequally distributed across genomic regions and gene families, with two disease-resistance gene families showing the highest SNP density, suggesting strong balancing selection. The allotetraploid C. arabica exhibits the greatest nucleotide diversity, followed by C. canephora and C. excelsa. Population divergence (FST), population stratification and phylogeny all support strong divergence among species, with C. arabica and its parental species C. canephora being genetically closer. Scanning for genomic islands with elevated FST and for structure-disruptive SNPs contributing to species divergence revealed that most of the selected genes in each lineage are independent, with a few being selected in parallel in two or three species, such as genes involved in root hair cell development, flavonol accumulation and disease resistance. Moreover, some of the SNPs associated with coffee lipids exhibit significantly biased allele frequencies among species and are therefore valuable for interspecific breeding. Overall, our study not only uncovers the key population genomic patterns among species but also contributes a substantial genomic resource for coffee breeding.
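To make the idea of scanning for genomic islands of elevated FST concrete, the following is a minimal sketch and not the authors' pipeline. It assumes Hudson's FST estimator (a swapped-in choice; the abstract does not say which estimator was used), a hypothetical input of per-site allele frequencies for two populations, and a hypothetical fixed, non-overlapping window size. The names `hudson_components` and `windowed_fst` are illustrative only. For the allotetraploid C. arabica, the number of sampled allele copies per site would depend on how genotypes were called, which is also left as an assumption here.

```python
from typing import Dict, List, Tuple

# One site: (position, p1, n1, p2, n2), where p is the frequency of one allele
# and n is the number of sampled allele copies in each population (assumed input).
Site = Tuple[int, float, int, float, int]


def hudson_components(p1: float, n1: int, p2: float, n2: int) -> Tuple[float, float]:
    """Return numerator and denominator of Hudson's FST for a single site."""
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    den = p1 * (1 - p2) + p2 * (1 - p1)
    return num, den


def windowed_fst(sites: List[Site], window: int) -> Dict[int, float]:
    """Ratio-of-averages FST per non-overlapping window, keyed by window start."""
    sums: Dict[int, List[float]] = {}
    for pos, p1, n1, p2, n2 in sites:
        start = (pos // window) * window
        num, den = hudson_components(p1, n1, p2, n2)
        acc = sums.setdefault(start, [0.0, 0.0])
        acc[0] += num
        acc[1] += den
    # Skip windows whose denominator is zero (no variable sites between populations).
    return {start: n / d for start, (n, d) in sorted(sums.items()) if d > 0}


if __name__ == "__main__":
    demo = [
        (100, 1.0, 10, 0.0, 10),    # fixed difference between the two populations
        (15000, 0.5, 10, 0.5, 10),  # identical allele frequencies
    ]
    for start, fst in windowed_fst(demo, window=10_000).items():
        print(start, round(fst, 3))
```

In a scan of this kind, windows in the upper tail of the genome-wide FST distribution would be the candidate islands; the tail cutoff is a further analysis choice that this sketch does not fix.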

Key message

Whole-genome resequencing of 93 coffee accessions unveils the diversity and genetic relationships of three important Coffea species. Independent and parallel selection of genes is identified during the divergence of the three species.

Data availability

All Illumina sequencing data have been submitted to NCBI and are available in the SRA database under accession number PRJNA505204.

Acknowledgements

We thank all members of the Yan lab for discussion and comments on the manuscript. This work was supported by the National Natural Science Foundation of China (31501364) and the Hainan Provincial Natural Science Foundation of China (2018CXTD342). We thank Razgar Seyed Rahmani and Robeto Bartolome from Ghent University for improving the language of the manuscript.

Author information

Contributions

LH and LY conceived and led the project. LH, XW and LY developed and performed genome assembly and analysis. LH, YD, YL and CH performed genomic sequencing. LH, TS and LY wrote the manuscript. TS revised the manuscript.

Corresponding authors

Correspondence to Lin Yan or Tao Shi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

11103_2020_974_MOESM1_ESM.pdf

Supplementary material 1 (PDF 3571.4 kb). Supplementary Figure S1. Eigenvalue from PC1 to PC100 in PCA study of 93 coffee individuals. Supplementary Figure S2. Approximately maximum-likelihood SNP tree with branch length resembling relative divergence. Supplementary Figure S3. Cross-entropy from K=1 to K=10 in population STRUCTURE analysis in LEA. Supplementary Figure S4. Venn diagram of genomic regions being selected by FST analyses. Supplementary Figure S5. Venn diagram of GO terms of genes being selected by FST analyses. Supplementary Figure S6. Top 20 enriched GO terms of commonly selected genes. Supplementary Figure S7. Top 20 enriched GO terms of selected genes between C. arabica and C. canephora using SNPs of disruptive impact. Supplementary Figure S8. Top 20 enriched GO terms of selected genes between C. arabica and C. excelsa using SNPs of disruptive impact. Supplementary Figure S9. Top 20 enriched GO terms of selected genes between C. canephora and C. excelsa using SNPs of disruptive impact. Supplementary Figure S10. Proportions of gene members selected by FST analysis based on SNPs of disruptive impact for pairwise comparisons of all three Coffea species.

11103_2020_974_MOESM2_ESM.xlsx

Supplementary material 2 (XLSX 4429.7 kb). Supplementary Table S1. Sample names, country of origin, climate and traits of 93 coffee accessions. Supplementary Table S2. Mapping rate, depth, coverage, fraction of missing SNPs in 93 coffee accessions on the reference genome. Supplementary Table S3. Comparison of allele consistency with known SNPs. Supplementary Table S4. Pfam annotation of coffee reference genome. Supplementary Table S5. Gene Ontology enrichment analysis of candidate genes under selection after split of C. arabica and C. canephora. Supplementary Table S6. Gene Ontology enrichment analysis of candidate genes under selection after split of C. arabica and C. excelsa. Supplementary Table S7. Gene Ontology enrichment analysis of candidate genes under selection after split of C. canephora and C. excelsa. Supplementary Table S8. FST of lipid-associated SNPs during species divergence.

Cite this article

Huang, L., Wang, X., Dong, Y. et al. Resequencing 93 accessions of coffee unveils independent and parallel selection during Coffea species divergence. Plant Mol Biol 103, 51–61 (2020). https://doi.org/10.1007/s11103-020-00974-4

  • DOI: https://doi.org/10.1007/s11103-020-00974-4
