Human Genetics, Volume 133, Issue 10, pp 1273–1287

Population and genomic lessons from genetic analysis of two Indian populations

  • Garima Juyal
  • Mayukh Mondal
  • Pierre Luisi
  • Hafid Laayouni
  • Ajit Sood
  • Vandana Midha
  • Peter Heutink
  • Jaume Bertranpetit
  • B. K. Thelma
  • Ferran Casals
Original Investigation

Abstract

Indian demographic history includes distinctive features such as founder effects, interpopulation segregation, a complex social structure with a caste system, and an elevated frequency of consanguineous marriages. India also shows a higher frequency of some rare Mendelian disorders and, over the last two decades, an increased prevalence of some complex disorders. Although India represents about one-sixth of the human population, in-depth genetic studies of this region have been scarce. In this study, we analyzed high-density genotyping and whole-exome sequencing data from a North and a South Indian population. Indian populations show higher differentiation levels than those reported between populations of other continents. We analyzed the consequences of this differentiation by specifically assessing the transferability of genetic markers from or to Indian populations. We show that genetic markers from available resources such as HapMap or the 1000 Genomes Project have limited portability to Indian populations, which also present an excess of private rare variants. Conversely, tagSNPs show a high level of portability between the two Indian populations, in contrast to the common belief that North and South Indian populations are genetically very different. By estimating kinship between mates and consanguinity from our trio data, we also describe different patterns of assortative mating and inbreeding in the two populations, in agreement with distinct mating preferences and social structures. In addition, this analysis allowed us to describe genomic regions under recent adaptive selection, indicating differential adaptive histories for North and South Indian populations. Our findings highlight the importance of considering demography in the design and analysis of genetic studies, as well as the need to extend human genetic variation catalogs to new populations, particularly those with distinctive demographic histories.

Notes

Acknowledgments

We thank Lara Nonell and Eulàlia Puigdecanet from the Servei d’Anàlisi de Microarrays (IMIM) for their invaluable help. We acknowledge David Sondervan and Ingrid Bakker from the Medical Genomics section of the VUMC for sequencing the samples. We thank Dr. A. R. Rao and Dr. Namita Sidhu from IASRI, New Delhi, India, for statistical assistance in the early part of the study. We deeply thank Txema Heredia, Ángel Carreño and Jordi Rambla for computational support, Marc Pybus for his help with the selection analysis, and David Comas for critical reading of the manuscript. An international fellowship funded by the Center for Neurogenomics and Cognitive Research (CNCR), VU, Amsterdam, The Netherlands, to GJ; a research grant from the J C Bose fellowship to BKT; grant # BT/01/COE/07/UDSC to BKT; and salary support to GJ are gratefully acknowledged. FC was supported by a Beatriu de Pinós fellowship (2010-BP-B-00128) and MM by a PhD grant, both from AGAUR (Generalitat de Catalunya). FC was funded by grant SAF2012-35025 from the Ministerio de Economía y Competitividad (Spain). JB was funded by grants BFU2010-19443 from the Ministerio de Ciencia y Tecnología (Spain) and PRI-PIBIN-2011-0942 from the Ministerio de Economía y Competitividad (Spain), and by the Direcció General de Recerca, Generalitat de Catalunya (Grup de Recerca Consolidat 2009 SGR 1101).

Supplementary material

Supplementary material 1: 439_2014_1462_MOESM1_ESM.docx (DOCX, 132 kb)

Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Garima Juyal (1)
  • Mayukh Mondal (2)
  • Pierre Luisi (2)
  • Hafid Laayouni (2, 3)
  • Ajit Sood (4)
  • Vandana Midha (5)
  • Peter Heutink (6)
  • Jaume Bertranpetit (2)
  • B. K. Thelma (1)
  • Ferran Casals (2, 7)

  1. Department of Genetics, University of Delhi South Campus, New Delhi, India
  2. Institut de Biologia Evolutiva (UPF-CSIC), Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, Barcelona, Spain
  3. Departament de Genètica i de Microbiologia, Grup de Biologia Evolutiva (GBE), Universitat Autònoma de Barcelona, Barcelona, Spain
  4. Department of Gastroenterology, Dayanand Medical College and Hospital, Ludhiana, India
  5. Department of Medicine, Dayanand Medical College and Hospital, Ludhiana, India
  6. Department of Clinical Genetics, Center for Neurogenomics and Cognitive Research-CNCR, Neuroscience Campus Amsterdam-NCA, VU Medical Center, Amsterdam, The Netherlands
  7. Genomics Core Facility, Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, Barcelona, Spain
