Plant Molecular Biology, Volume 83, Issue 3, pp 177–189

BAC-end sequences analysis provides first insights into coffee (Coffea canephora P.) genome composition and evolution

  • Alexis Dereeper
  • Romain Guyot
  • Christine Tranchant-Dubreuil
  • François Anthony
  • Xavier Argout
  • Fabien de Bellis
  • Marie-Christine Combes
  • Frederick Gavory
  • Alexandre de Kochko
  • Dave Kudrna
  • Thierry Leroy
  • Julie Poulain
  • Myriam Rondeau
  • Xiang Song
  • Rod Wing
  • Philippe Lashermes

Abstract

Coffee is one of the world’s most important agricultural commodities. It belongs to the Rubiaceae family in the euasterid I clade of dicotyledonous plants, which also includes the Solanaceae family. Two bacterial artificial chromosome (BAC) libraries were constructed from a homozygous doubled haploid plant of Coffea canephora using two enzymes, HindIII and BstYI. A total of 134,827 high-quality BAC-end sequences (BESs) were generated from the 73,728 clones of the two libraries, and 131,412 BESs were retained for further analysis after elimination of chloroplast and mitochondrial sequences. This corresponded to almost 13 % of the estimated size of the C. canephora genome. Simple sequence repeats were found in 6.7 % of BESs, the most abundant (47.8 %) being mononucleotide motifs. These sequences allow the development of numerous useful marker sites. Potential transposable elements (TEs) represented 11.9 % of the full-length BESs, with a difference between the BstYI and HindIII libraries (14.9 vs. 8.8 %). Analysis of BESs against known coding sequences of TEs indicated that 11.9 % of the genome corresponded to known repeat sequences, as in other flowering plants. The number of genes in the coffee genome was estimated at 41,973, which is probably an overestimate. Comparative genome mapping revealed that microsynteny was higher between coffee and grapevine than between coffee and tomato or Arabidopsis. BESs constitute valuable resources for the first genome-wide survey of coffee and provide new insights into the composition and evolution of the coffee genome.
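
As an illustration of how simple sequence repeats of the kind counted above might be located in a BES, the following is a minimal Python sketch. It is not the authors' pipeline: the abstract does not name the tool or thresholds used, so the function names (`find_ssrs`, `motif_length_counts`) and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen only for the example.

```python
from collections import Counter

# Assumed thresholds (hypothetical, not taken from the paper):
# minimum number of tandem copies required for each motif length.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for each perfect SSR, scanning left to right."""
    seq = seq.upper()
    hits = []
    i = 0
    while i < len(seq):
        for k in sorted(min_repeats):  # try the shortest motif first
            motif = seq[i:i + k]
            if len(motif) < k or set(motif) - set("ACGT"):
                continue  # ran off the end, or ambiguous base such as N
            if not is_primitive(motif):
                continue  # e.g. "AA" is reported as the mononucleotide "A"
            copies = 1
            while seq[i + copies * k:i + (copies + 1) * k] == motif:
                copies += 1
            if copies >= min_repeats[k]:
                hits.append((i, motif, copies))
                i += copies * k - 1  # move to the last base of this repeat
                break
        i += 1
    return hits


def motif_length_counts(hits):
    """Tally SSRs by motif length (1 = mononucleotide, 2 = dinucleotide, ...)."""
    return Counter(len(motif) for _, motif, _ in hits)


if __name__ == "__main__":
    example = "GC" + "A" * 12 + "CG" + "AT" * 7 + "GGC"
    print(find_ssrs(example))
```

Dividing each entry of `motif_length_counts` by the total number of hits would give motif-class proportions comparable in kind to the 47.8 % mononucleotide share reported above, although the actual figure depends on the thresholds and software the authors used.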

Keywords

Comparative genomics; Coffea; Genome; BAC library; Transposable elements; Microsatellites

Notes

Acknowledgments

This research was supported by a grant from the Agence Nationale de la Recherche (ANR; Genoplante ANR-08-GENM-022-001).

Supplementary material

Supplementary material 1: 11103_2013_77_MOESM1_ESM.doc (DOC, 234 kb)
Supplementary material 2: 11103_2013_77_MOESM2_ESM.txt (TXT, 4838 kb)

Copyright information

© Springer Science+Business Media Dordrecht 2013

Authors and Affiliations

  • Alexis Dereeper (1)
  • Romain Guyot (2)
  • Christine Tranchant-Dubreuil (2)
  • François Anthony (1)
  • Xavier Argout (3)
  • Fabien de Bellis (3)
  • Marie-Christine Combes (1)
  • Frederick Gavory (4)
  • Alexandre de Kochko (2)
  • Dave Kudrna (5)
  • Thierry Leroy (3)
  • Julie Poulain (4)
  • Myriam Rondeau (1)
  • Xiang Song (5)
  • Rod Wing (5)
  • Philippe Lashermes (1)
  1. Institut de Recherche pour le Développement (IRD), UMR RPB (CIRAD, IRD, UM2), Montpellier Cedex 5, France
  2. Institut de Recherche pour le Développement (IRD), UMR DIADE (CIRAD, IRD, UM2), Montpellier Cedex 5, France
  3. Centre de Coopération Internationale en Recherche Agronomique pour le Développement (CIRAD), UMR AGAP, Montpellier, France
  4. Commissariat à l’Energie Atomique (CEA), Institut de Génomique, Evry, France
  5. Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, USA