Human Genetics, Volume 131, Issue 5, pp 757–771

Polymorphic NumtS trace human population relationships

  • Martin Lang
  • Marco Sazzini
  • Francesco Maria Calabrese
  • Domenico Simone
  • Alessio Boattini
  • Giovanni Romeo
  • Donata Luiselli
  • Marcella Attimonelli
  • Giuseppe Gasparre
Original Investigation


The human genome is constantly subjected to evolutionary forces that shape its architecture. Insertions of mitochondrial DNA sequences into the nuclear genome (NumtS) have been described in several eukaryotic species, including Homo sapiens and other primates. Because NumtS are generated continuously, they are valuable markers in primate phylogenetic studies and potentially informative loci for reconstructing the genetic history of modern humans. Here, we report the identification of 53 human-specific NumtS by inspection of the UCSC genome browser, showing that they may be direct insertions of mitochondrial DNA into the human nuclear DNA after the human-chimpanzee split. In silico analyses allowed us to identify 14 NumtS that are polymorphic for presence/absence in the genomes of individuals of different ancestry. The allele frequencies of these polymorphic NumtS were calculated from 1000 Genomes Project sequence data for 13 populations worldwide, and principal components analysis and hierarchical clustering detected strong signals of geographical structure in the genetic diversity of these loci. All identified polymorphic human-specific NumtS, together with a tandemly duplicated NumtS, were also validated by PCR amplification on a panel of 60 samples belonging to five native populations worldwide, confirming the expected NumtS variability. On the basis of these findings, we depict the landscape of variation of a series of NumtS in several ethnic groups, advancing their identification as useful markers for the study of human population genetics.
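
As an illustration of the kind of calculation the abstract refers to, the following is a minimal sketch of how presence/absence allele frequencies might be computed per population and compared. It is not the authors' pipeline. The genotype encoding (0, 1 or 2 insertion-carrying chromosomes per individual), the population names, the toy genotypes and the function names are all assumptions made for this example, and plain Euclidean distance stands in for whatever distance measure a principal components or hierarchical clustering analysis would actually use.

```python
from math import sqrt


def insertion_allele_frequency(genotypes):
    """Frequency of the NumtS-present allele at one locus.

    Each genotype is the number of insertion-carrying chromosomes
    in a diploid individual (0, 1 or 2). This encoding is an assumption.
    """
    if not genotypes:
        raise ValueError("no genotypes supplied")
    return sum(genotypes) / (2 * len(genotypes))


def frequency_profile(loci_genotypes):
    """Per-locus insertion frequencies for one population."""
    return [insertion_allele_frequency(g) for g in loci_genotypes]


def euclidean(p, q):
    """Euclidean distance between two frequency profiles of equal length."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical toy data: three made-up populations, two made-up loci,
# four individuals each. These are NOT values from the study.
toy = {
    "popA": [[0, 1, 2, 1], [0, 0, 0, 1]],
    "popB": [[0, 1, 1, 2], [0, 0, 1, 0]],
    "popC": [[2, 2, 2, 1], [1, 2, 2, 2]],
}

profiles = {name: frequency_profile(loci) for name, loci in toy.items()}

# Pairwise distances between populations; a matrix like this is the usual
# input to a hierarchical clustering step.
distances = {
    (a, b): euclidean(profiles[a], profiles[b])
    for a in profiles
    for b in profiles
    if a < b
}
```

In this toy setup, populations with identical frequency profiles have distance zero, while a population with markedly higher insertion frequencies sits far from the others, which is the kind of signal clustering methods pick up.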


Segmental Duplication; Chimpanzee Genome; Continental Group; Genome Project Data; Validation Panel



This study was partly supported by the Italian Ministry of University and Research (MIUR) grant FIRB ‘Futuro in Ricerca’ J31J10000040001 to G.G. and contributions from Prof. Herawati Sudoyo of the Eijkman Institute of Molecular Biology, Jakarta (Indonesia) to G.G. and from the “Fondo di Ateneo” (University of Bari) to M.A.

Supplementary material

439_2011_1125_MOESM1_ESM.doc (1.2 mb)
Supplementary material 1 (DOC 1191 kb)
439_2011_1125_MOESM2_ESM.xls (126 kb)
Supplementary material 2 (XLS 125 kb)



Copyright information

© Springer-Verlag 2011

Authors and Affiliations

  • Martin Lang (1)
  • Marco Sazzini (2)
  • Francesco Maria Calabrese (3)
  • Domenico Simone (3)
  • Alessio Boattini (2)
  • Giovanni Romeo (1)
  • Donata Luiselli (2)
  • Marcella Attimonelli (3)
  • Giuseppe Gasparre (1)

  1. Dipartimento di Scienze Ginecologiche, Ostetriche e Pediatriche, U.O. Genetica Medica, Pad.11, Pol.S.Orsola-Malpighi, Università di Bologna, Bologna, Italy
  2. Dipartimento di Biologia Evoluzionistica Sperimentale, Laboratorio di Antropologia Molecolare, Università di Bologna, Bologna, Italy
  3. Dipartimento di Biochimica e Biologia Molecolare “E. Quagliariello”, Università di Bari, Bari, Italy
