Identification and mapping of conserved ortholog set (COS) II sequences of cacao and their conversion to SNP markers for marker-assisted selection in Theobroma cacao and comparative genomics studies

Original Paper
Tree Genetics & Genomes

Abstract

Theobroma cacao (cacao) is a tree cultivated throughout the tropics for its seeds, which are the source of both chocolate and cocoa butter. Genetic marker development for marker-assisted selection (MAS) is critical for the success of cacao breeding for disease resistance and yield. To develop conserved ortholog set II (COSII) single-nucleotide polymorphism (SNP) markers for MAS in cacao, we used three strategies and three types of cacao genetic and sequence data to identify and map 98 cacao COSII genes. The strategy used in each case was dictated by the resources available when the study was first undertaken. Strategy I identified SNPs using cacao expressed sequence tags homologous to COSII sequences. Strategy II used a leaf transcriptome of cacao genotype “Matina 1–6”, and Strategy III used the genomic sequence of a 3-Mb region of “Matina 1–6” linkage group 5 associated with an important quantitative trait locus (QTL) for resistance to black pod. We identified SNP markers for 83 of the 98 mapped COSII genes, and 19 of these SNP markers co-locate with QTLs. These COSII SNP markers, the first identified for cacao, will be used for genotyping and off-typing in cacao breeding programs and for genetic mapping and syntenic studies that trace the co-location of genes regulating important traits between cacao and other species.

Acknowledgments

We wish to acknowledge Mars, Inc. for partial funding of this project, Barbie Freeman for excellent technical support, Dr. Belinda Martineau for editing the manuscript, and Dr. J. Michael Moore for assistance with the statistical analysis of the synteny data.

Author information

Corresponding author

Correspondence to David N. Kuhn.

Additional information

Communicated by A. Dandekar

Electronic supplementary material

Below is the link to the electronic supplementary material.

ESM 1

Table of all SNPs for the 83 COSII loci that contained SNPs. The table gives the linkage group for each COSII locus and the minor allele frequency (MAF) for each SNP. (PDF 116 kb)
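
As an illustration of how a per-SNP MAF value of the kind reported in ESM 1 can be computed, the following is a minimal Python sketch. It assumes diploid genotype calls written as two-character strings (for example "AG") and a biallelic SNP; the function name and input format are hypothetical and are not taken from the article.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Return the frequency of the less common allele at one biallelic SNP.

    `genotypes` is a list of diploid calls such as "AA", "AG" or "GG".
    Missing calls are passed as None and skipped (an assumption of this sketch).
    """
    counts = Counter()
    for call in genotypes:
        if call is None:
            continue
        # Each character of the call is one allele copy.
        counts.update(call)

    if len(counts) > 2:
        raise ValueError("expected a biallelic SNP")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no genotype calls")
    if len(counts) < 2:
        # Only one allele observed: the SNP is monomorphic in this sample.
        return 0.0
    return min(counts.values()) / total


if __name__ == "__main__":
    # Hypothetical calls for four trees at one SNP.
    print(minor_allele_frequency(["AA", "AG", "GG", "AA"]))
```

In this sketch the allele counts are pooled over all called individuals, so a SNP with few calls yields a MAF based on a correspondingly small sample.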

ESM 2

List of all flanking sequences for the SNPs in ESM 1, in FASTA format. (PDF 31 kb)

About this article

Cite this article

Kuhn, D.N., Livingstone, D., Main, D. et al. Identification and mapping of conserved ortholog set (COS) II sequences of cacao and their conversion to SNP markers for marker-assisted selection in Theobroma cacao and comparative genomics studies. Tree Genetics & Genomes 8, 97–111 (2012). https://doi.org/10.1007/s11295-011-0424-0

  • DOI: https://doi.org/10.1007/s11295-011-0424-0
