Estimating the Relative Contributions of New Genes from Retrotransposition and Segmental Duplication Events during Mammalian Evolution

  • Jin Jun
  • Paul Ryvkin
  • Edward Hemphill
  • Ion Măndoiu
  • Craig Nelson
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5267)


Gene duplication has long been recognized as a major force in genome evolution and, more recently, as an important source of individual variation. For many years functional gene duplicates were assumed to originate from whole or partial genome duplication events, but retrotransposition has recently been shown to contribute new functional protein-coding genes and siRNAs as well. Here we present a method for the identification and classification of retrotransposed and segmentally duplicated genes and pseudogenes based on local synteny. Using the results of this approach, we compare the rates of segmental duplication and retrotransposition in five mammalian genomes and estimate the rate of new functional protein-coding gene formation by each mechanism. We find that retrotransposition occurs at a much higher and temporally more variable rate than segmental duplication, and gives rise to many more duplicated sequences over time. While the chance that a retrotransposed copy becomes functional is much lower than that of a segmentally duplicated copy, the higher rate of retrotransposition events leads to nearly equal contributions of new genes by each mechanism.
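The two ideas in the abstract, classification by local synteny and the trade-off between event rate and chance of becoming functional, can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function names, the gene-family lookup, the `min_shared` threshold, and every number used with it are hypothetical placeholders chosen only for illustration. The sketch assumes that a segmentally duplicated copy tends to carry neighboring genes along with it, so its flanking genes share families with the parent's flanking genes, whereas a retrotransposed copy lands in an unrelated neighborhood.

```python
from typing import Dict, List, Set


def shared_neighbor_families(parent_flank: List[str],
                             copy_flank: List[str],
                             family_of: Dict[str, str]) -> int:
    """Count gene families that occur in both flanking windows.

    parent_flank / copy_flank: IDs of genes adjacent to the parent gene
    and to the duplicated copy (hypothetical inputs).
    family_of: maps a gene ID to a gene-family label (hypothetical).
    """
    parent_fams: Set[str] = {family_of[g] for g in parent_flank if g in family_of}
    copy_fams: Set[str] = {family_of[g] for g in copy_flank if g in family_of}
    return len(parent_fams & copy_fams)


def classify_duplicate(parent_flank: List[str],
                       copy_flank: List[str],
                       family_of: Dict[str, str],
                       min_shared: int = 1) -> str:
    """Label a duplicate by local synteny (illustrative rule only).

    Conserved neighbors suggest a segmental duplication; no conserved
    neighbors suggest retrotransposition. min_shared is an assumed
    threshold, not a value taken from the paper.
    """
    shared = shared_neighbor_families(parent_flank, copy_flank, family_of)
    return "segmental_duplication" if shared >= min_shared else "retrotransposition"


def expected_new_genes(n_events: float, p_functional: float) -> float:
    """Expected functional genes = number of events x chance of function."""
    return n_events * p_functional


# Toy data: gene IDs and family labels are invented for this example.
FAMILY_OF = {"A1": "famA", "A2": "famA", "B1": "famB",
             "B2": "famB", "C1": "famC", "D1": "famD"}

if __name__ == "__main__":
    parent = ["A1", "B1"]
    print(classify_duplicate(parent, ["A2", "B2"], FAMILY_OF))
    print(classify_duplicate(parent, ["C1", "D1"], FAMILY_OF))
    # Placeholder rates: many events with a low chance of function can
    # yield about as many new genes as few events with a higher chance.
    print(expected_new_genes(1000, 0.01), expected_new_genes(100, 0.1))
```

With these placeholder values, a tenfold higher event count offset by a tenfold lower chance of function gives the same expected number of new genes, which is the shape of the argument in the abstract's closing sentence. The actual rates and probabilities are those estimated in the paper and are not reproduced here.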


Duplication Event; Segmental Duplication; Primate Branch; Internal Branch; Ensembl Gene
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Jin Jun (1)
  • Paul Ryvkin (2)
  • Edward Hemphill (3)
  • Ion Măndoiu (1)
  • Craig Nelson (3)
  1. Computer Science & Engineering Department, University of Connecticut, Storrs, USA
  2. Genomics & Computational Biology Graduate Group, University of Pennsylvania, Philadelphia, USA
  3. Genetics & Genomics Program, Department of Molecular & Cell Biology, University of Connecticut, Storrs, USA
