Journal of Molecular Evolution, Volume 69, Issue 1, pp 81–93

Evolutionary Dynamics of Recently Duplicated Genes: Selective Constraints on Diverging Paralogs in the Drosophila pseudoobscura Genome



Duplicated genes produce genetic variation that can influence the evolution of genomes and phenotypes. In most cases, for a duplicated gene to contribute to evolutionary novelty it must survive the early stages of divergence from its paralog without becoming a pseudogene. I examined the evolutionary dynamics of recently duplicated genes in the Drosophila pseudoobscura genome to understand the factors affecting these early stages of evolution. Paralogs located in closer proximity have higher sequence identity. This suggests that gene conversion occurs more often between duplications in close proximity or that there is more genetic independence between distant paralogs. Partially duplicated genes have a higher likelihood of pseudogenization than completely duplicated genes, but no single factor significantly contributes to the selective constraints on a completely duplicated gene. However, DNA-based duplications and duplications within chromosome arms tend to produce longer duplication tracts than retroposed and inter-arm duplications, and longer duplication tracts are more likely to contain a completely duplicated gene. Therefore, the relative position of paralogs and the mechanism of duplication indirectly affect whether a duplicated gene is retained or pseudogenized.


Keywords: Drosophila, Gene duplication, Pseudogene, Copy number polymorphism

Supplementary material

239_2009_9254_MOESM1_ESM.pdf (303 kb)
Supplementary Figure S1. Relative rates of amino acid evolution for ancestral and derived paralogs. Estimated amino acid substitutions along the ancestral (dark gray) and derived (white) lineages are graphed for each duplicated gene in the dataset; each pair of bars represents the ancestral and derived copy of a duplicated gene, respectively. Paralogs for which the ancestral and derived copy cannot be distinguished are indicated by two gray bars. Paralogs for which the derived copy contains (A) a partial coding sequence and (B) a complete coding sequence are shown, and the status of the open reading frame of the derived copy is indicated below the X-axis. A relative rate test based on a chi-square test was used to determine if the difference in amino acid substitutions in the ancestral and derived copy of each duplicated gene is significant (Tajima 1993). Paralogs for which the test could not be performed (because of too few substitutions between the paralogs) are indicated by the black bars labeled “N/A”. Paralogs for which the test is significant at P < 0.05 are indicated by a single asterisk and paralogs for which the test is significant at P < 0.005 are indicated by two asterisks. Two completely duplicated genes in which the derived copy is evolving significantly faster than the ancestral copy are labeled. Supplementary material 1 (PDF 304 kb)
239_2009_9254_MOESM2_ESM.pdf (271 kb)
Supplementary Figure S2. Tissue expression of D. melanogaster orthologs of completely duplicated genes. Tissue expression data were retrieved for the D. melanogaster orthologs of each completely duplicated gene in the D. pseudoobscura genome (Chintapalli et al. 2007). For each tissue, the number of non-degenerated and degenerated genes expressed and not expressed in that tissue is graphed. A single asterisk indicates a significant departure from independence between degeneration and expression using a G test with P < 0.05. Supplementary material 2 (PDF 272 kb)
239_2009_9254_MOESM3_ESM.pdf (27 kb)
Supplementary material 3 (PDF 27 kb)
239_2009_9254_MOESM4_ESM.xls (16 kb)
Supplementary Table S1. Copy number polymorphism primers. Supplementary material 4 (XLS 17 kb)
239_2009_9254_MOESM5_ESM.xls (66 kb)
Supplementary Table S2. Data on each duplicated gene. Supplementary material 5 (XLS 67 kb)
Supplementary Data. Annotated alignments of duplicated genes. Alignments are in the MEGA format, with coding and non-coding sequences annotated. Ancestral copies are indicated by “anc” in the sequence name, and derived copies are indicated by “dup” in the sequence name. For duplicated genes where the ancestral and derived copies could not be determined, the two copies are named “copyA” and “copyB”. In some alignments containing multiple genes, coding sequences that were in opposite orientations were reverse complemented (relative to the rest of the aligned sequence) to keep open reading frames in the proper orientation. The following genes were reverse complemented relative to the rest of the aligned sequence: CG15287, CG14860, CG8016, CG8589, CG13190, CG13063, CG11070, CG16734, CG16983. Supplementary material 6 (ZIP 198 kb)
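The two significance tests named in the supplementary figure captions above, the relative rate test of Tajima (1993) and the G test of independence, can be illustrated with a minimal Python sketch. This is not the author's code. The function names, the use of a third (outgroup) sequence to polarize substitutions, the skipping of gapped columns, and the absence of any continuity or Williams correction in the G test are all assumptions made for illustration.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def tajima_relative_rate(anc, dup, outgroup):
    """Relative rate test in the style of Tajima (1993) on three aligned sequences.

    m1 counts sites where only `anc` differs from the other two sequences,
    m2 counts sites where only `dup` differs. Under equal rates,
    (m1 - m2)^2 / (m1 + m2) is approximately chi-square with 1 df.
    Returns (m1, m2, chi2, p); chi2 and p are None when m1 + m2 == 0.
    """
    if not (len(anc) == len(dup) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    m1 = m2 = 0
    for a, d, o in zip(anc, dup, outgroup):
        if "-" in (a, d, o):      # assumption: ignore alignment gaps
            continue
        if a != d and d == o:     # change unique to the ancestral copy
            m1 += 1
        elif a != d and a == o:   # change unique to the derived copy
            m2 += 1
    if m1 + m2 == 0:
        return m1, m2, None, None
    chi2 = (m1 - m2) ** 2 / (m1 + m2)
    return m1, m2, chi2, chi2_sf_1df(chi2)


def g_test_2x2(table):
    """G test of independence for a 2x2 table [[a, b], [c, d]] (no correction).

    Returns (G, p), with p taken from the chi-square distribution with 1 df.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:      # empty cells contribute nothing to G
                expected = row_totals[i] * col_totals[j] / n
                g += observed * math.log(observed / expected)
    g *= 2.0
    return g, chi2_sf_1df(g)
```

For example, `tajima_relative_rate(anc_protein, dup_protein, outgroup_protein)` would compare the two copies of one duplicated gene, and `g_test_2x2([[expressed_degenerated, expressed_intact], [silent_degenerated, silent_intact]])` would test one tissue; all of these variable names are hypothetical placeholders.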


  1. Adams MD, Celniker SE, Holt RA, Evans CA, Gocayne JD, Amanatides PG, Scherer SE, Li PW, Hoskins RA, Galle RF, George RA, Lewis SE, Richards S, Ashburner M, Henderson SN, Sutton GG, Wortman JR, Yandell MD, Zhang Q, Chen LX, Brandon RC, Rogers Y-HC, Blazej RG, Champe M, Pfeiffer BD, Wan KH, Doyle C, Baxter EG, Helt G, Nelson CR, Gabor Miklos GL, Abril JF, Agbayani A, An H-J, Andrews-Pfannkoch C, Baldwin D, Ballew RM, Basu A, Baxendale J, Bayraktaroglu L, Beasley EM, Beeson KY, Benos PV, Berman BP, Bhandari D, Bolshakov S, Borkova D, Botchan MR, Bouck J, Brokstein P, Brottier P, Burtis KC, Busam DA, Butler H, Cadieu E, Center A, Chandra I, Cherry JM, Cawley S, Dahlke C, Davenport LB, Davies P, Bd Pablos, Delcher A, Deng Z, Mays AD, Dew I, Dietz SM, Dodson K, Doup LE, Downes M, Dugan-Rocha S, Dunkov BC, Dunn P, Durbin KJ, Evangelista CC, Ferraz C, Ferriera S, Fleischmann W, Fosler C, Gabrielian AE, Garg NS, Gelbart WM, Glasser K, Glodek A, Gong F, Gorrell JH, Gu Z, Guan P, Harris M, Harris NL, Harvey D, Heiman TJ, Hernandez JR, Houck J, Hostin D, Houston KA, Howland TJ, Wei M-H, Ibegwam C et al (2000) The genome sequence of Drosophila melanogaster. Science 287:2185–2195
  2. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucl Acids Res 25:3389–3402
  3. Arguello JR, Chen Y, Yang S, Wang W, Long M (2006) Origination of an X-linked testes chimeric gene by illegitimate recombination in Drosophila. PLoS Genet 2:e77
  4. Arguello JR, Fan C, Wang W, Long M (2007) Origination of chimeric genes through DNA-level recombination. Genome Dyn 3:131–146
  5. Benovoy D, Drouin G (2009) Ectopic gene conversions in the human genome. Genomics 93:27–32
  6. Bhutkar A, Russo SM, Smith TF, Gelbart WM (2007) Genome-scale analysis of positionally relocated genes. Genome Res 17:1880–1887
  7. Burge C, Karlin S (1997) Prediction of complete gene structures in human genomic DNA. J Mol Biol 268:78–94
  8. Byrne KP, Wolfe KH (2007) Consistent patterns of rate asymmetry and gene loss indicate widespread neofunctionalization of yeast genes after whole-genome duplication. Genetics 175:1341–1350
  9. Chintapalli VR, Wang J, Dow JAT (2007) Using FlyAtlas to identify better Drosophila melanogaster models of human disease. Nat Genet 39:715–720
  10. Coulombe-Huntington J, Majewski J (2007) Characterization of intron loss events in mammals. Genome Res 17:23–32
  11. Cusack BP, Wolfe KH (2007) Not born equal: increased rate asymmetry in relocated and retrotransposed rodent gene duplicates. Mol Biol Evol 24:679–686
  12. Demuth JP, Bie TD, Stajich JE, Cristianini N, Hahn MW (2006) The evolution of mammalian gene families. PLoS ONE 1:e85
  13. Dopman EB, Hartl DL (2007) A portrait of copy-number polymorphism in Drosophila melanogaster. Proc Natl Acad Sci USA 104:19920–19925
  14. Drosophila 12 Genomes Consortium (2007) Evolution of genes and genomes on the Drosophila phylogeny. Nature 450:203–218
  15. Drouin G (2002) Characterization of the gene conversions between the multigene family members of the yeast genome. J Mol Evol 55:14–23
  16. Emerson JJ, Cardoso-Moreira M, Borevitz JO, Long M (2008) Natural selection shapes genome-wide patterns of copy-number polymorphism in Drosophila melanogaster. Science 320:1629–1631
  17. Fink GR (1987) Pseudogenes in yeast? Cell 49:5–6
  18. Force A, Lynch M, Pickett FB, Amores A, Yan Y-l, Postlethwait J (1999) Preservation of duplicate genes by complementary, degenerative mutations. Genetics 151:1531–1545
  19. Gotea V, Veeramachaneni V, Makalowski W (2003) Mastering seeds for genomic size nucleotide BLAST searches. Nucl Acids Res 31:6935–6941
  20. Haber JE, Leung WY, Borts RH, Lichten M (1991) The frequency of meiotic recombination in yeast is independent of the number and position of homologous donor sequences: implications for chromosome pairing. Proc Natl Acad Sci USA 88:1120–1124
  21. Hahn MW, Han MV, Han S-G (2007) Gene family evolution across 12 Drosophila genomes. PLoS Genet 3:e197
  22. Harrison PM, Gerstein M (2002) Studying genomes through the aeons: protein families, pseudogenes and proteome evolution. J Mol Biol 318:1155–1174
  23. Harrison PM, Milburn D, Zhang Z, Bertone P, Gerstein M (2003) Identification of pseudogenes in the Drosophila melanogaster genome. Nucl Acids Res 31:1033–1037
  24. He X, Zhang J (2005) Rapid subfunctionalization accompanied by prolonged and substantial neofunctionalization in duplicate gene evolution. Genetics 169:1157–1164
  25. Heger A, Ponting CP (2007) Evolutionary rate analyses of orthologs and paralogs from 12 Drosophila genomes. Genome Res 17:1837–1849
  26. Hughes AL (1994) The evolution of functionally novel proteins after gene duplication. Proc R Soc Lond B Biol Sci 256:119–124
  27. Jones CD, Begun DJ (2005) Parallel evolution of chimeric fusion genes. Proc Natl Acad Sci USA 102:11373–11378
  28. Katju V, Lynch M (2003) The structure and early evolution of recently arisen gene duplicates in the Caenorhabditis elegans genome. Genetics 165:1793–1803
  29. Kondrashov F, Rogozin I, Wolf Y, Koonin E (2002) Selection in the evolution of gene duplications. Genome Biol 3:0008.1–0008.9
  30. Krimbas C, Powell J (2000) Inversion polymorphisms in Drosophila. In: Singh RS, Krimbas CB (eds) Evolutionary genetics: from molecules to morphology. Cambridge University Press, Cambridge, pp 284–299
  31. Kumar S, Tamura K, Nei M (2004) MEGA3: integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief Bioinform 5:150–163
  32. Lazzaro BP, Clark AG (2001) Evidence for recurrent paralogous gene conversion and exceptional allelic divergence in the attacin genes of Drosophila melanogaster. Genetics 159:659–671
  33. Lin Y-S, Byrnes JK, Hwang J-K, Li W-H (2006) Codon-usage bias versus gene conversion in the evolution of yeast duplicate genes. Proc Natl Acad Sci USA 103:14412–14416
  34. Long M, Langley CH (1993) Natural selection and the origin of jingwei, a chimeric processed functional gene in Drosophila. Science 260:91–95
  35. Long M, Thornton K (2001) Gene duplication and evolution. Science 293:1551a
  36. Long M, Betran E, Thornton K, Wang W (2003) The origin of new genes: glimpses from the young and old. Nat Rev Genet 4:865–875
  37. Lynch M, Conery JS (2000) The evolutionary fate and consequences of duplicate genes. Science 290:1151–1155
  38. Lynch M, Katju V (2004) The altered evolutionary trajectories of gene duplicates. Trends Genet 20:544–549
  39. Lynch M, O’Hely M, Walsh B, Force A (2001) The probability of preservation of a newly arisen gene duplicate. Genetics 159:1789–1804
  40. Meisel RP (2009) Repeat mediated gene duplication in the Drosophila pseudoobscura genome. Gene 438:1–7
  41. Moore RC, Purugganan MD (2005) The evolutionary dynamics of plant duplicate genes. Curr Opin Plant Biol 8:122–128
  42. Muller HJ (1940) Bearings of the ‘Drosophila’ work on systematics. In: Huxley J (ed) The new systematics. Clarendon Press, Oxford, pp 185–268
  43. Nei M, Gojobori T (1986) Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol Biol Evol 3:418–426
  44. Nozawa M, Nei M (2007) Evolutionary dynamics of olfactory receptor genes in Drosophila species. Proc Natl Acad Sci USA 104:7122–7127
  45. Nozawa M, Aotsuka T, Tamura K (2005) A novel chimeric gene, siren, with retroposed promoter sequence in the Drosophila bipectinata complex. Genetics 171:1719–1727
  46. Ohno S (1970) Evolution by gene duplication. Springer-Verlag, New York
  47. Osada N, Innan H (2008) Duplication and gene conversion in the Drosophila melanogaster genome. PLoS Genet 4:e1000305
  48. Papp B, Pal C, Hurst LD (2003) Dosage sensitivity and the evolution of gene families in yeast. Nature 424:194–197
  49. Petes TD, Fink GR (1982) Gene conversion between repeated genes. Nature 300:216–217
  50. Petrov D, Hartl D (2000) Pseudogene evolution and natural selection for a compact genome. J Hered 91:221–227
  51. Popadic A, Popadic D, Anderson W (1995) Interchromosomal exchange of genetic information between gene arrangements on the third chromosome of Drosophila pseudoobscura. Mol Biol Evol 12:938–943
  52. Powell JR (1992) Inversion polymorphisms in Drosophila pseudoobscura and Drosophila persimilis. In: Krimbas CB, Powell JR (eds) Drosophila inversion polymorphism. CRC Press, Boca Raton, pp 73–126
  53. Richards S, Liu Y, Bettencourt BR, Hradecky P, Letovsky S, Nielsen R, Thornton K, Hubisz MJ, Chen R, Meisel RP, Couronne O, Hua S, Smith MA, Zhang P, Liu J, Bussemaker HJ, van Batenburg MF, Howells SL, Scherer SE, Sodergren E, Matthews BB, Crosby MA, Schroeder AJ, Ortiz-Barrientos D, Rives CM, Metzker ML, Muzny DM, Scott G, Steffen D, Wheeler DA, Worley KC, Havlak P, Durbin KJ, Egan A, Gill R, Hume J, Morgan MB, Miner G, Hamilton C, Huang Y, Waldron L, Verduzco D, Clerc-Blankenburg KP, Dubchak I, Noor MAF, Anderson W, White KP, Clark AG, Schaeffer SW, Gelbart W, Weinstock GM, Gibbs RA (2005) Comparative genome sequencing of Drosophila pseudoobscura: chromosomal, gene, and cis-element evolution. Genome Res 15:1–18
  54. Schaeffer SW, Goetting-Minesky MP, Kovacevic M, Peoples JR, Graybill JL, Miller JM, Kim K, Nelson JG, Anderson WW (2003) Evolutionary genomics of inversions in Drosophila pseudoobscura: evidence for epistasis. Proc Natl Acad Sci USA 100:8319–8324
  55. Schaeffer SW, Bhutkar A, McAllister BF, Matsuda M, Matzkin LM, O’Grady PM, Rohde C, Valente VLS, Aguade M, Anderson WW, Edwards K, Garcia ACL, Goodman J, Hartigan J, Kataoka E, Lapoint RT, Lozovsky ER, Machado CA, Noor MAF, Papaceit M, Reed LK, Richards S, Rieger TT, Russo SM, Sato H, Segarra C, Smith DR, Smith TF, Strelets V, Tobari YN, Tomimura Y, Wasserman M, Watts T, Wilson R, Yoshida K, Markow TA, Gelbart WM, Kaufman TC (2008) Polytene chromosomal maps of 11 Drosophila species: the order of genomic scaffolds inferred from genetic and physical maps. Genetics 179:1601–1655
  56. Sebat J, Lakshmi B, Troge J, Alexander J, Young J, Lundin P, Maner S, Massa H, Walker M, Chi M, Navin N, Lucito R, Healy J, Hicks J, Ye K, Reiner A, Gilliam TC, Trask B, Patterson N, Zetterberg A, Wigler M (2004) Large-scale copy number polymorphism in the human genome. Science 305:525–528
  57. Semple C, Wolfe KH (1999) Gene duplication and gene conversion in the Caenorhabditis elegans genome. J Mol Evol 48:555–564
  58. Seoighe C, Gehring C (2004) Genome duplication led to highly selective expansion of the Arabidopsis thaliana proteome. Trends Genet 20:461–464
  59. Sidow A (1996) Gen(om)e duplications in the evolution of early vertebrates. Curr Opin Genet Dev 6:715–722
  60. Slightom JL, Blechl AE, Smithies O (1980) Human fetal Gγ- and Aγ-globin genes: complete nucleotide sequences suggest that DNA can be exchanged between these duplicated genes. Cell 21:627–638
  61. Smit AFA, Hubley R, Green P (2004) RepeatMasker Open-3.0
  62. Sokal RR, Rohlf FJ (1995) Biometry. W.H. Freeman and Co., New York
  63. Tajima F (1993) Simple methods for testing the molecular evolutionary clock hypothesis. Genetics 135:599–607
  64. Teshima KM, Innan H (2004) The effect of gene conversion on the divergence between duplicated genes. Genetics 166:1553–1560
  65. Thompson JD, Higgins DG, Gibson TJ (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22:4673–4680
  66. Thornton K, Long M (2005) Excess of amino acid substitutions relative to polymorphism between X-linked duplications in Drosophila melanogaster. Mol Biol Evol 22:273–284
  67. Turner TL, Levine MT, Eckert ML, Begun DJ (2008) Genomic analysis of adaptive differentiation in Drosophila melanogaster. Genetics 179:455–473
  68. Wang Y, Gu X (2001) Functional divergence in the caspase gene family and altered functional constraints: statistical analysis and prediction. Genetics 158:1311–1320
  69. Yang S, Arguello JR, Li X, Ding Y, Zhou Q, Chen Y, Zhang Y, Zhao R, Brunet F, Peng L, Long M, Wang W (2008) Repetitive element-mediated recombination as a mechanism for new gene origination in Drosophila. PLoS Genet 4:e3
  70. Zhang J (2003) Evolution by gene duplication: an update. Trends Ecol Evol 18:292–298
  71. Zhang Z, Schwartz S, Wagner L, Miller W (2000) A greedy algorithm for aligning DNA sequences. J Comput Biol 7:203–214
  72. Zhang Y, Sturgill D, Parisi M, Kumar S, Oliver B (2007) Constraint and turnover in sex-biased gene expression in the genus Drosophila. Nature 450:233–237
  73. Zhou Q, Zhang G, Zhang Y, Xu S, Zhao R, Zhan Z, Li X, Ding Y, Yang S, Wang W (2008) On the origin of new genes in Drosophila. Genome Res 18:1446–1455

Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. Intercollege Graduate Program in Genetics and Department of Biology, The Pennsylvania State University, University Park, USA
  2. Department of Molecular Biology and Genetics, Cornell University, Ithaca, USA
