214:50

High-throughput targeted genotyping using next-generation sequencing applied in Coffea canephora breeding

  • Emilly Ruas Alkimim
  • Eveline Teixeira Caixeta
  • Tiago Vieira Sousa
  • Felipe Lopes da Silva
  • Ney Sussumu Sakiyama
  • Laércio Zambolim


The use of molecular markers to detect polymorphism at the DNA level is one of the most significant developments in molecular biology. With the development of next-generation sequencing technologies, SNP discovery has become easier and faster, and the cost per data point has fallen. The development and use of SNP markers for coffee have provided new perspectives for evaluating genetic diversity and population structure with different statistical approaches. In this study, 72 Coffea canephora genotypes were analyzed to identify SNP markers and apply them to genetic studies and to the selection of parents and hybrids in breeding. A total of 117,450 SNPs were identified using the RAPiD Genomics platform. After quality analyses, 33,485 SNPs were validated for analyses of genetic diversity and population structure. Genotypes were separated according to their varietal groups, and hybrids were differentiated, using clustering and a Bayesian approach. Coffee accessions that had been misidentified in the germplasm bank and breeding program were detected. The Conilon varietal group presented the lowest genetic dissimilarity values, which suggests that new accessions should be introduced into the germplasm bank. The highest genetic distance values were observed between genotypes of the two heterotic groups (Conilon and Robusta). The markers were efficient in evaluating the genetic diversity and population structure of C. canephora. Promising crosses were selected within and between the varietal groups. Hybrids with greater genetic distances were also selected; these are important for C. canephora breeding programs.


Single nucleotide polymorphism · Genetic variability · Population structure · Conilon · Robusta · Hybrid



This work was financially supported by the Brazilian Coffee Research and Development Consortium (Consórcio Brasileiro de Pesquisa e Desenvolvimento do Café - CBP&D/Café), the Foundation for Research Support of the state of Minas Gerais (FAPEMIG), the National Council of Scientific and Technological Development (CNPq), and the National Institutes of Science and Technology of Coffee (INCT/Café).

Compliance with ethical standards

Conflict of interest

The authors declare no conflict of interest.

Supplementary material

10681_2018_2126_MOESM1_ESM.xlsx (727 kb)
Supplementary material 1 (XLSX 727 kb)



Copyright information

© Springer Science+Business Media B.V., part of Springer Nature 2018

Authors and Affiliations

  • Emilly Ruas Alkimim (1)
  • Eveline Teixeira Caixeta (2)
  • Tiago Vieira Sousa (1)
  • Felipe Lopes da Silva (3)
  • Ney Sussumu Sakiyama (3)
  • Laércio Zambolim (4)

  1. BIOAGRO, BioCafé, Universidade Federal de Viçosa, Viçosa, Brazil
  2. Empresa Brasileira de Pesquisa Agropecuária - Embrapa Café, BIOAGRO, BioCafé, Universidade Federal de Viçosa, Viçosa, Brazil
  3. Departamento de Fitotecnia, Universidade Federal de Viçosa, Viçosa, Brazil
  4. Departamento de Fitopatologia, Universidade Federal de Viçosa, Viçosa, Brazil
