
The effect of SNP discovery method and sample size on estimation of population genetic data for Chinese and Indian rhesus macaques (Macaca mulatta)

  • Original Article
Primates

Abstract

This study addresses questions of sample size and marker location that have arisen from the discovery of SNPs in the genomes of poorly characterized primate species and from the application of these markers to the study of primate population genetics. First, we predict the effect of discovery sample size on the probability of discovering both rare and common SNPs and then compare this prediction with the proportion of common and rare SNPs discovered when different numbers of individuals are sequenced. Second, we examine the effect of genomic region on common population genetic estimates, comparing markers from coding and non-coding regions of the rhesus macaque genome, to measure the degree and direction of bias introduced by SNPs located in coding versus non-coding regions. We found that both discovery sample size and genomic region surveyed affect SNP marker attributes and population genetic estimates, even when these are calculated from an expanded data set containing more individuals than the original discovery data set. Although none of the SNP detection methods or genomic regions tested in this study was completely uninformative, these results show that each captures a different kind of genetic variation that is suitable for different purposes, and each introduces specific types of bias. Given that each SNP marker has an individual evolutionary history, we calculated that the most complete and unbiased representation of the genetic diversity present in the individual can be obtained by incorporating at least 10 individuals into the discovery sample set, to ensure the discovery of both common and rare polymorphisms.
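The dependence of SNP discovery on sample size described above can be illustrated with a simple binomial sampling argument. The sketch below is not the authors' calculation, which this abstract does not specify. It assumes a biallelic SNP, diploid individuals, random and independent sampling of chromosomes, and no sequencing error. Under those assumptions, a SNP with minor allele frequency p is detected in n individuals only if both alleles appear among the 2n sampled chromosomes, which has probability 1 - (1 - p)^(2n) - p^(2n). The function name and the frequencies used for "rare" and "common" are illustrative choices, not values from the paper.

```python
def discovery_probability(maf: float, n_individuals: int) -> float:
    """Probability that a biallelic SNP is seen as polymorphic in the sample.

    Assumes (hypothetically) random, independent sampling of 2 * n chromosomes
    from a population in which the minor allele has frequency `maf`.
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("minor allele frequency must lie in [0, 0.5]")
    if n_individuals < 1:
        raise ValueError("at least one individual is required")
    chromosomes = 2 * n_individuals
    # Subtract the two ways the sample can be monomorphic:
    # every chromosome carries the major allele, or every one carries the minor.
    return 1.0 - (1.0 - maf) ** chromosomes - maf ** chromosomes


if __name__ == "__main__":
    # Illustrative frequencies only: 0.05 stands in for a rare SNP, 0.30 for a common one.
    for n in (1, 2, 5, 10, 20):
        rare = discovery_probability(0.05, n)
        common = discovery_probability(0.30, n)
        print(f"n={n:2d}  rare={rare:.3f}  common={common:.3f}")
```

Running the script prints the detection probability for each discovery panel size, which shows why small panels are biased toward common polymorphisms: the probability of detecting a rare allele rises much more slowly with n than that of a common one.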



Acknowledgments

The authors would like to acknowledge the laboratory assistance provided by Debra George and Joy Erickson. This work was funded by a grant from the National Institutes of Health, no. RR05090, awarded to DGS.

Corresponding author

Correspondence to Jessica A. Satkoski Trask.

About this article

Cite this article

Trask, J.A.S., Malhi, R.S., Kanthaswamy, S. et al. The effect of SNP discovery method and sample size on estimation of population genetic data for Chinese and Indian rhesus macaques (Macaca mulatta). Primates 52, 129–138 (2011). https://doi.org/10.1007/s10329-010-0232-4


  • DOI: https://doi.org/10.1007/s10329-010-0232-4
