
Population structure and genetic diversity of coffee progenies derived from Catuaí and Híbrido de Timor revealed by genome-wide SNP marker


Single nucleotide polymorphism (SNP) molecular markers have advanced the selection methodologies used in breeding programs of different crops, reducing the cost and time of cultivar release. Despite the great economic and social importance of Coffea arabica, studies with SNP markers are scarce, and few SNPs are available for this species compared with other crops of agronomic importance. The objective of this study was therefore to identify and validate SNP molecular markers for Coffea arabica and to introduce these markers to genetic breeding through an accurate analysis of the diversity and genetic structure of breeding populations of this species. After quality filtering, 11,187 SNP markers were selected from the coffee population obtained from crosses between the genotypes Catuaí and Híbrido de Timor. A large number of markers were distributed across the 11 chromosomes, within transcribed regions, and were used to estimate the genetic dissimilarity among the individuals of the breeding population. Dendrogram analysis and a Bayesian approach demonstrated the formation of two groups and the discrimination of all genotypes evaluated. The large number of SNP molecular markers distributed throughout the C. arabica genome was effective in discriminating all the accessions evaluated in the experiment, clustering them according to their genealogies. This work identified mixtures within the progenies. The genotyping data also provided detailed information about the parental genotypes and led to the identification of new candidate parents to be introduced to the breeding program. The study discusses population structure and its consequences for obtaining improved varieties of C. arabica.
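As an illustration of how SNP genotypes can be turned into a genetic dissimilarity matrix and then grouped, the following minimal Python sketch may help. It is not the authors' pipeline: the 0/1/2 allele-dosage coding, the mean-dosage-difference measure, the average-linkage clustering, and the toy genotypes and names (`A1`, `B1`, and so on) are all assumptions made for this example, and the diploid-style coding is a simplification for an allotetraploid species.

```python
from itertools import combinations


def snp_dissimilarity(a, b):
    """Mean allele-dosage difference between two genotypes, scaled to [0, 1].

    Assumed coding: each locus is 0, 1 or 2 copies of the alternate allele;
    None marks a missing call and the locus is skipped for that pair.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        raise ValueError("no loci genotyped in both individuals")
    return sum(abs(x - y) for x, y in shared) / (2 * len(shared))


def dissimilarity_matrix(genotypes):
    """Return a dict mapping (name, name) pairs to pairwise dissimilarity."""
    names = list(genotypes)
    d = {(n, n): 0.0 for n in names}
    for p, q in combinations(names, 2):
        d[p, q] = d[q, p] = snp_dissimilarity(genotypes[p], genotypes[q])
    return d


def average_linkage(genotypes, k):
    """Merge the two closest clusters (average linkage) until k remain."""
    d = dissimilarity_matrix(genotypes)
    clusters = [[n] for n in genotypes]

    def dist(c1, c2):
        return sum(d[p, q] for p in c1 for q in c2) / (len(c1) * len(c2))

    while len(clusters) > k:
        # combinations yields i < j, so deleting index j keeps index i valid
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical toy data: four individuals, six SNP loci.
toy = {
    "A1": [0, 0, 2, 2, 1, None],
    "A2": [0, 0, 2, 1, 1, 0],
    "B1": [2, 2, 0, 0, 1, 2],
    "B2": [2, 1, 0, 0, None, 2],
}
groups = average_linkage(toy, k=2)
print(groups)
```

With real data, the genotype lists would come from the filtered SNP calls, and the number of groups would be chosen from the dendrogram or from a model-based analysis rather than fixed in advance.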






This work was financially supported by the Brazilian Coffee Research and Development Consortium (Consórcio Brasileiro de Pesquisa e Desenvolvimento do Café—CBP&D/Café), by the Foundation for Research Support of the state of Minas Gerais (FAPEMIG), by the National Council of Scientific and Technological Development (CNPq), and by the National Institutes of Science and Technology of Coffee (INCT/Café).

Author information

Correspondence to Eveline Teixeira Caixeta.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Data archiving statement

The authors have not submitted biological data to any of the public databases.

Additional information

Communicated by P. Ingvarsson

Electronic supplementary material


(XLSX 249 kb).


About this article


Cite this article

Sousa, T.V., Caixeta, E.T., Alkimim, E.R. et al. Population structure and genetic diversity of coffee progenies derived from Catuaí and Híbrido de Timor revealed by genome-wide SNP marker. Tree Genetics & Genomes 13, 124 (2017). https://doi.org/10.1007/s11295-017-1208-y



  • Coffea arabica
  • Introgression
  • Next-generation sequence
  • Genetic relationships
  • Molecular breeding
  • InStruct