BAC-end sequences analysis provides first insights into coffee (Coffea canephora P.) genome composition and evolution

Plant Molecular Biology

Abstract

Coffee is one of the world’s most important agricultural commodities. Coffee belongs to the Rubiaceae family in the euasterid I clade of dicotyledonous plants, to which the Solanaceae family also belongs. Two bacterial artificial chromosome (BAC) libraries of a homozygous doubled haploid plant of Coffea canephora were constructed using two enzymes, HindIII and BstYI. A total of 134,827 high-quality BAC-end sequences (BESs) were generated from the 73,728 clones of the two libraries, and 131,412 BESs were retained for further analysis after elimination of chloroplast and mitochondrial sequences. This corresponded to almost 13 % of the estimated size of the C. canephora genome. Simple sequence repeats were found in 6.7 % of the BESs, the most abundant (47.8 %) being mononucleotide motifs. These sequences allow the development of numerous useful marker sites. Potential transposable elements (TEs) represented 11.9 % of the full-length BESs. A difference was observed between the BstYI and HindIII libraries (14.9 vs. 8.8 %). Analysis of BESs against known coding sequences of TEs indicated that 11.9 % of the genome corresponded to known repeat sequences, as in other flowering plants. The number of genes in the coffee genome was estimated at 41,973, which is probably an overestimate. Comparative genome mapping revealed that microsynteny was higher between coffee and grapevine than between coffee and tomato or Arabidopsis. BESs constitute valuable resources for the first genome-wide survey of coffee and provide new insights into the composition and evolution of the coffee genome.
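
The simple sequence repeat (SSR) survey summarized above can be illustrated with a minimal Python sketch. It is not the pipeline used in the article: the function names `find_ssrs` and `summarize` and the minimum repeat counts in `MIN_REPEATS` are hypothetical choices made only for illustration. The sketch scans each sequence for perfect repeats of 1 to 6 bp motifs, then reports the fraction of sequences carrying at least one SSR and the number of SSRs per motif length.

```python
import re
from collections import Counter

# Hypothetical thresholds (motif length -> minimum number of repeats).
# These values are assumptions for illustration, not taken from the article.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (motif, repeat_count, start) for each perfect SSR in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A lookahead lets the scan test every start position, so a run of a
        # non-primitive motif (e.g. "AA") cannot hide an adjacent real SSR.
        pattern = re.compile(r"(?=(([ACGT]{%d})\2{%d,}))" % (k, n - 1))
        last_end = 0
        for m in pattern.finditer(seq):
            run, motif = m.group(1), m.group(2)
            if m.start() < last_end or not is_primitive(motif):
                continue  # inside an SSR already reported, or e.g. "AA"
            hits.append((motif, len(run) // k, m.start()))
            last_end = m.start() + len(run)
    return hits


def summarize(seqs):
    """Fraction of sequences with at least one SSR, and SSR counts by motif length."""
    with_ssr = 0
    by_motif_length = Counter()
    for s in seqs:
        hits = find_ssrs(s)
        if hits:
            with_ssr += 1
        for motif, _, _ in hits:
            by_motif_length[len(motif)] += 1
    return with_ssr / len(seqs), by_motif_length
```

Applied to a real set of BESs, `summarize` would yield figures comparable in kind to the 6.7 % and 47.8 % quoted above, although the exact values depend strongly on the thresholds chosen.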

Acknowledgments

This research was supported by a grant from the Agence Nationale de la Recherche (ANR; Genoplante ANR-08-GENM-022-001).

Author information

Corresponding author

Correspondence to Philippe Lashermes.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOC 234 kb)

Supplementary material 2 (TXT 4838 kb)

About this article

Cite this article

Dereeper, A., Guyot, R., Tranchant-Dubreuil, C. et al. BAC-end sequences analysis provides first insights into coffee (Coffea canephora P.) genome composition and evolution. Plant Mol Biol 83, 177–189 (2013). https://doi.org/10.1007/s11103-013-0077-5

  • DOI: https://doi.org/10.1007/s11103-013-0077-5
