Abstract
Duplicated genes produce genetic variation that can influence the evolution of genomes and phenotypes. In most cases, for a duplicated gene to contribute to evolutionary novelty it must survive the early stages of divergence from its paralog without becoming a pseudogene. I examined the evolutionary dynamics of recently duplicated genes in the Drosophila pseudoobscura genome to understand the factors affecting these early stages of evolution. Paralogs located in closer proximity have higher sequence identity. This suggests that gene conversion occurs more often between duplications in close proximity or that there is more genetic independence between distant paralogs. Partially duplicated genes have a higher likelihood of pseudogenization than completely duplicated genes, but no single factor significantly contributes to the selective constraints on a completely duplicated gene. However, DNA-based duplications and duplications within chromosome arms tend to produce longer duplication tracts than retroposed and inter-arm duplications, and longer duplication tracts are more likely to contain a completely duplicated gene. Therefore, the relative position of paralogs and the mechanism of duplication indirectly affect whether a duplicated gene is retained or pseudogenized.
References
Adams MD, Celniker SE, Holt RA, Evans CA, Gocayne JD, Amanatides PG, Scherer SE, Li PW, Hoskins RA, Galle RF, George RA, Lewis SE, Richards S, Ashburner M, Henderson SN, Sutton GG, Wortman JR, Yandell MD, Zhang Q, Chen LX, Brandon RC, Rogers Y-HC, Blazej RG, Champe M, Pfeiffer BD, Wan KH, Doyle C, Baxter EG, Helt G, Nelson CR, Gabor Miklos GL, Abril JF, Agbayani A, An H-J, Andrews-Pfannkoch C, Baldwin D, Ballew RM, Basu A, Baxendale J, Bayraktaroglu L, Beasley EM, Beeson KY, Benos PV, Berman BP, Bhandari D, Bolshakov S, Borkova D, Botchan MR, Bouck J, Brokstein P, Brottier P, Burtis KC, Busam DA, Butler H, Cadieu E, Center A, Chandra I, Cherry JM, Cawley S, Dahlke C, Davenport LB, Davies P, de Pablos B, Delcher A, Deng Z, Mays AD, Dew I, Dietz SM, Dodson K, Doup LE, Downes M, Dugan-Rocha S, Dunkov BC, Dunn P, Durbin KJ, Evangelista CC, Ferraz C, Ferriera S, Fleischmann W, Fosler C, Gabrielian AE, Garg NS, Gelbart WM, Glasser K, Glodek A, Gong F, Gorrell JH, Gu Z, Guan P, Harris M, Harris NL, Harvey D, Heiman TJ, Hernandez JR, Houck J, Hostin D, Houston KA, Howland TJ, Wei M-H, Ibegwam C et al (2000) The genome sequence of Drosophila melanogaster. Science 287:2185–2195
Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucl Acids Res 25:3389–3402
Arguello JR, Chen Y, Yang S, Wang W, Long M (2006) Origination of an X-linked testes chimeric gene by illegitimate recombination in Drosophila. PLoS Genet 2:e77
Arguello JR, Fan C, Wang W, Long M (2007) Origination of chimeric genes through DNA-level recombination. Genome Dyn 3:131–146
Benovoy D, Drouin G (2009) Ectopic gene conversions in the human genome. Genomics 93:27–32
Bhutkar A, Russo SM, Smith TF, Gelbart WM (2007) Genome-scale analysis of positionally relocated genes. Genome Res 17:1880–1887
Burge C, Karlin S (1997) Prediction of complete gene structures in human genomic DNA. J Mol Biol 268:78–94
Byrne KP, Wolfe KH (2007) Consistent patterns of rate asymmetry and gene loss indicate widespread neofunctionalization of yeast genes after whole-genome duplication. Genetics 175:1341–1350
Chintapalli VR, Wang J, Dow JAT (2007) Using FlyAtlas to identify better Drosophila melanogaster models of human disease. Nat Genet 39:715–720
Coulombe-Huntington J, Majewski J (2007) Characterization of intron loss events in mammals. Genome Res 17:23–32
Cusack BP, Wolfe KH (2007) Not born equal: increased rate asymmetry in relocated and retrotransposed rodent gene duplicates. Mol Biol Evol 24:679–686
Demuth JP, De Bie T, Stajich JE, Cristianini N, Hahn MW (2006) The evolution of mammalian gene families. PLoS ONE 1:e85
Dopman EB, Hartl DL (2007) A portrait of copy-number polymorphism in Drosophila melanogaster. Proc Natl Acad Sci USA 104:19920–19925
Drosophila 12 Genomes Consortium (2007) Evolution of genes and genomes on the Drosophila phylogeny. Nature 450:203–218
Drouin G (2002) Characterization of the gene conversions between the multigene family members of the yeast genome. J Mol Evol 55:14–23
Emerson JJ, Cardoso-Moreira M, Borevitz JO, Long M (2008) Natural selection shapes genome-wide patterns of copy-number polymorphism in Drosophila melanogaster. Science 320:1629–1631
Fink GR (1987) Pseudogenes in yeast? Cell 49:5–6
Force A, Lynch M, Pickett FB, Amores A, Yan Y-L, Postlethwait J (1999) Preservation of duplicate genes by complementary, degenerative mutations. Genetics 151:1531–1545
Gotea V, Veeramachaneni V, Makalowski W (2003) Mastering seeds for genomic size nucleotide BLAST searches. Nucl Acids Res 31:6935–6941
Haber JE, Leung WY, Borts RH, Lichten M (1991) The frequency of meiotic recombination in yeast is independent of the number and position of homologous donor sequences: implications for chromosome pairing. Proc Natl Acad Sci USA 88:1120–1124
Hahn MW, Han MV, Han S-G (2007) Gene family evolution across 12 Drosophila genomes. PLoS Genet 3:e197
Harrison PM, Gerstein M (2002) Studying genomes through the aeons: protein families, pseudogenes and proteome evolution. J Mol Biol 318:1155–1174
Harrison PM, Milburn D, Zhang Z, Bertone P, Gerstein M (2003) Identification of pseudogenes in the Drosophila melanogaster genome. Nucl Acids Res 31:1033–1037
He X, Zhang J (2005) Rapid subfunctionalization accompanied by prolonged and substantial neofunctionalization in duplicate gene evolution. Genetics 169:1157–1164
Heger A, Ponting CP (2007) Evolutionary rate analyses of orthologs and paralogs from 12 Drosophila genomes. Genome Res 17:1837–1849
Hughes AL (1994) The evolution of functionally novel proteins after gene duplication. Proc R Soc Lond B Biol Sci 256:119–124
Jones CD, Begun DJ (2005) Parallel evolution of chimeric fusion genes. Proc Natl Acad Sci USA 102:11373–11378
Katju V, Lynch M (2003) The structure and early evolution of recently arisen gene duplicates in the Caenorhabditis elegans genome. Genetics 165:1793–1803
Kondrashov F, Rogozin I, Wolf Y, Koonin E (2002) Selection in the evolution of gene duplications. Genome Biol 3:0008.1–0008.9
Krimbas C, Powell J (2000) Inversion polymorphisms in Drosophila. In: Singh RS, Krimbas CB (eds) Evolutionary genetics: from molecules to morphology. Cambridge University Press, Cambridge, pp 284–299
Kumar S, Tamura K, Nei M (2004) MEGA3: integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief Bioinform 5:150–163
Lazzaro BP, Clark AG (2001) Evidence for recurrent paralogous gene conversion and exceptional allelic divergence in the attacin genes of Drosophila melanogaster. Genetics 159:659–671
Lin Y-S, Byrnes JK, Hwang J-K, Li W-H (2006) Codon-usage bias versus gene conversion in the evolution of yeast duplicate genes. Proc Natl Acad Sci USA 103:14412–14416
Long M, Langley CH (1993) Natural selection and the origin of jingwei, a chimeric processed functional gene in Drosophila. Science 260:91–95
Long M, Thornton K (2001) Gene duplication and evolution. Science 293:1551a
Long M, Betran E, Thornton K, Wang W (2003) The origin of new genes: glimpses from the young and old. Nat Rev Genet 4:865–875
Lynch M, Conery JS (2000) The evolutionary fate and consequences of duplicate genes. Science 290:1151–1155
Lynch M, Katju V (2004) The altered evolutionary trajectories of gene duplicates. Trends Genet 20:544–549
Lynch M, O’Hely M, Walsh B, Force A (2001) The probability of preservation of a newly arisen gene duplicate. Genetics 159:1789–1804
Meisel RP (2009) Repeat mediated gene duplication in the Drosophila pseudoobscura genome. Gene 438:1–7
Moore RC, Purugganan MD (2005) The evolutionary dynamics of plant duplicate genes. Curr Opin Plant Biol 8:122–128
Muller HJ (1940) Bearings of the ‘Drosophila’ work on systematics. In: Huxley J (ed) The new systematics. Clarendon Press, Oxford, pp 185–268
Nei M, Gojobori T (1986) Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol Biol Evol 3:418–426
Nozawa M, Nei M (2007) Evolutionary dynamics of olfactory receptor genes in Drosophila species. Proc Natl Acad Sci USA 104:7122–7127
Nozawa M, Aotsuka T, Tamura K (2005) A novel chimeric gene, siren, with retroposed promoter sequence in the Drosophila bipectinata complex. Genetics 171:1719–1727
Ohno S (1970) Evolution by gene duplication. Springer-Verlag, New York
Osada N, Innan H (2008) Duplication and gene conversion in the Drosophila melanogaster genome. PLoS Genet 4:e1000305
Papp B, Pal C, Hurst LD (2003) Dosage sensitivity and the evolution of gene families in yeast. Nature 424:194–197
Petes TD, Fink GR (1982) Gene conversion between repeated genes. Nature 300:216–217
Petrov D, Hartl D (2000) Pseudogene evolution and natural selection for a compact genome. J Hered 91:221–227
Popadic A, Popadic D, Anderson W (1995) Interchromosomal exchange of genetic information between gene arrangements on the third chromosome of Drosophila pseudoobscura. Mol Biol Evol 12:938–943
Powell JR (1992) Inversion polymorphisms in Drosophila pseudoobscura and Drosophila persimilis. In: Krimbas CB, Powell JR (eds) Drosophila inversion polymorphism. CRC Press, Boca Raton, pp 73–126
Richards S, Liu Y, Bettencourt BR, Hradecky P, Letovsky S, Nielsen R, Thornton K, Hubisz MJ, Chen R, Meisel RP, Couronne O, Hua S, Smith MA, Zhang P, Liu J, Bussemaker HJ, van Batenburg MF, Howells SL, Scherer SE, Sodergren E, Matthews BB, Crosby MA, Schroeder AJ, Ortiz-Barrientos D, Rives CM, Metzker ML, Muzny DM, Scott G, Steffen D, Wheeler DA, Worley KC, Havlak P, Durbin KJ, Egan A, Gill R, Hume J, Morgan MB, Miner G, Hamilton C, Huang Y, Waldron L, Verduzco D, Clerc-Blankenburg KP, Dubchak I, Noor MAF, Anderson W, White KP, Clark AG, Schaeffer SW, Gelbart W, Weinstock GM, Gibbs RA (2005) Comparative genome sequencing of Drosophila pseudoobscura: chromosomal, gene, and cis-element evolution. Genome Res 15:1–18
Schaeffer SW, Goetting-Minesky MP, Kovacevic M, Peoples JR, Graybill JL, Miller JM, Kim K, Nelson JG, Anderson WW (2003) Evolutionary genomics of inversions in Drosophila pseudoobscura: evidence for epistasis. Proc Natl Acad Sci USA 100:8319–8324
Schaeffer SW, Bhutkar A, McAllister BF, Matsuda M, Matzkin LM, O’Grady PM, Rohde C, Valente VLS, Aguade M, Anderson WW, Edwards K, Garcia ACL, Goodman J, Hartigan J, Kataoka E, Lapoint RT, Lozovsky ER, Machado CA, Noor MAF, Papaceit M, Reed LK, Richards S, Rieger TT, Russo SM, Sato H, Segarra C, Smith DR, Smith TF, Strelets V, Tobari YN, Tomimura Y, Wasserman M, Watts T, Wilson R, Yoshida K, Markow TA, Gelbart WM, Kaufman TC (2008) Polytene chromosomal maps of 11 Drosophila species: the order of genomic scaffolds inferred from genetic and physical maps. Genetics 179:1601–1655
Sebat J, Lakshmi B, Troge J, Alexander J, Young J, Lundin P, Maner S, Massa H, Walker M, Chi M, Navin N, Lucito R, Healy J, Hicks J, Ye K, Reiner A, Gilliam TC, Trask B, Patterson N, Zetterberg A, Wigler M (2004) Large-scale copy number polymorphism in the human genome. Science 305:525–528
Semple C, Wolfe KH (1999) Gene duplication and gene conversion in the Caenorhabditis elegans genome. J Mol Evol 48:555–564
Seoighe C, Gehring C (2004) Genome duplication led to highly selective expansion of the Arabidopsis thaliana proteome. Trends Genet 20:461–464
Sidow A (1996) Gen(om)e duplications in the evolution of early vertebrates. Curr Opin Genet Dev 6:715–722
Slightom JL, Blechl AE, Smithies O (1980) Human fetal Gγ- and Aγ-globin genes: complete nucleotide sequences suggest that DNA can be exchanged between these duplicated genes. Cell 21:627–638
Smit AFA, Hubley R, Green P (2004) RepeatMasker Open-3.0
Sokal RR, Rohlf FJ (1995) Biometry. W.H. Freeman and Co., New York
Tajima F (1993) Simple methods for testing the molecular evolutionary clock hypothesis. Genetics 135:599–607
Teshima KM, Innan H (2004) The effect of gene conversion on the divergence between duplicated genes. Genetics 166:1553–1560
Thompson JD, Higgins DG, Gibson TJ (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucl Acids Res 22:4673–4680
Thornton K, Long M (2005) Excess of amino acid substitutions relative to polymorphism between X-linked duplications in Drosophila melanogaster. Mol Biol Evol 22:273–284
Turner TL, Levine MT, Eckert ML, Begun DJ (2008) Genomic analysis of adaptive differentiation in Drosophila melanogaster. Genetics 179:455–473
Wang Y, Gu X (2001) Functional divergence in the caspase gene family and altered functional constraints: statistical analysis and prediction. Genetics 158:1311–1320
Yang S, Arguello JR, Li X, Ding Y, Zhou Q, Chen Y, Zhang Y, Zhao R, Brunet F, Peng L, Long M, Wang W (2008) Repetitive element-mediated recombination as a mechanism for new gene origination in Drosophila. PLoS Genet 4:e3
Zhang J (2003) Evolution by gene duplication: an update. Trends Ecol Evol 18:292–298
Zhang Z, Schwartz S, Wagner L, Miller W (2000) A greedy algorithm for aligning DNA sequences. J Comput Biol 7:203–214
Zhang Y, Sturgill D, Parisi M, Kumar S, Oliver B (2007) Constraint and turnover in sex-biased gene expression in the genus Drosophila. Nature 450:233–237
Zhou Q, Zhang G, Zhang Y, Xu S, Zhao R, Zhan Z, Li X, Ding Y, Yang S, Wang W (2008) On the origin of new genes in Drosophila. Genome Res 18:1446–1455
Acknowledgements
N. Hasan, B. B. Hilldorfer, R. LeGros, and R. L. Zindren helped with sorting the BLAST hits, and N. Hasan and B. B. Hilldorfer assisted in testing the recently duplicated genes for CNP. S. W. Schaeffer and J. R. Arguello provided useful discussion and comments on the manuscript. V. Gotea and W. Makalowski provided assistance with RepeatMasker and MegaBLAST, and V. Gotea also commented on the manuscript. This material is partially based on work supported by the National Science Foundation under Grant No. 0608186, awarded to RPM. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Supplementary Figure S1
Relative rates of amino acid evolution for ancestral and derived paralogs. Estimated amino acid substitutions along the ancestral (dark gray) and derived (white) lineages are graphed for each duplicated gene in the dataset; each pair of bars represents the ancestral and derived copy of a duplicated gene, respectively. Paralogs for which the ancestral and derived copy cannot be distinguished are indicated by two gray bars. Paralogs for which the derived copy contains (A) a partial coding sequence and (B) a complete coding sequence are shown, and the status of the open reading frame of the derived copy is indicated below the X-axis. A relative rate test based on a chi-square test was used to determine if the difference in amino acid substitutions in the ancestral and derived copy of each duplicated gene is significant (Tajima 1993). Paralogs for which the test could not be performed (because of too few substitutions between the paralogs) are indicated by the black bars labeled “N/A”. Paralogs for which the test is significant at P < 0.05 are indicated by a single asterisk and paralogs for which the test is significant at P < 0.005 are indicated by two asterisks. Two completely duplicated genes in which the derived copy is evolving significantly faster than the ancestral copy are labeled. Supplementary material 1 (PDF 304 kb)
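As an illustration of the relative rate test cited in this caption (Tajima 1993), the sketch below computes the chi-square statistic with one degree of freedom from the numbers of substitutions unique to each lineage. The function name, the argument names, and the min_total cutoff used to report "N/A" are hypothetical and are not taken from the article, which does not state the cutoff it applied.

```python
import math


def tajima_relative_rate(m_anc, m_dup, min_total=1):
    """Chi-square relative rate test on lineage-specific substitution counts.

    m_anc, m_dup: substitutions unique to the ancestral and derived copy.
    min_total: hypothetical cutoff; below it the test is not performed.
    Returns (chi_square, p_value), or None when there are too few substitutions.
    """
    total = m_anc + m_dup
    if total < min_total or total == 0:
        return None  # corresponds to the "N/A" cases in the caption
    chi_square = (m_anc - m_dup) ** 2 / total
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


def significance_label(p_value):
    """Map a P value to the asterisk notation described in the caption."""
    if p_value < 0.005:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""
```

For example, 2 substitutions on the ancestral lineage and 10 on the derived lineage give a statistic of 64/12, about 5.33, which is significant at P < 0.05 but not at P < 0.005.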
Supplementary Figure S2
Tissue expression of D. melanogaster orthologs of completely duplicated genes. Tissue expression data were retrieved for the D. melanogaster orthologs of each completely duplicated gene in the D. pseudoobscura genome (Chintapalli et al. 2007). For each tissue, the number of non-degenerated and degenerated genes expressed and not expressed in that tissue is graphed. A single asterisk indicates a significant departure from independence between degeneration and expression using a G test with P < 0.05. Supplementary material 2 (PDF 272 kb)
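The G test of independence mentioned in this caption can be sketched for a single tissue as a 2 x 2 table of degeneration status by expression status. This is a minimal illustration only: the function name and the table layout are assumptions, no correction (such as Williams' correction) is applied, and the article does not specify whether one was used.

```python
import math


def g_test_2x2(table):
    """G test of independence for a 2 x 2 table of counts.

    table: [[a, b], [c, d]], e.g. rows = non-degenerated / degenerated genes,
    columns = expressed / not expressed in the tissue (assumed layout).
    Returns (G, p_value) with 1 degree of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    grand_total = sum(row_totals)
    if grand_total == 0:
        raise ValueError("table contains no observations")

    g_stat = 0.0
    for i in range(2):
        for j in range(2):
            observed = table[i][j]
            if observed == 0:
                continue  # a zero cell contributes nothing to the sum
            expected = row_totals[i] * col_totals[j] / grand_total
            g_stat += observed * math.log(observed / expected)
    g_stat = max(2.0 * g_stat, 0.0)  # guard against tiny negative rounding error

    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(g_stat / 2.0))
    return g_stat, p_value
```

A tissue would be marked with an asterisk when the returned P value is below 0.05.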
Supplementary Table S1
Copy number polymorphism primers. Supplementary material 4 (XLS 17 kb)
Supplementary Table S2
Data on each duplicated gene. Supplementary material 5 (XLS 67 kb)
239_2009_9254_MOESM6_ESM.zip
Supplementary Data: Annotated alignments of duplicated genes. Alignments are in the MEGA format, with coding and non-coding sequences annotated. Ancestral copies are indicated by “anc” in the sequence name, and derived copies are indicated by “dup” in the sequence name. For duplicated genes where the ancestral and derived copies could not be determined, the two copies are named “copyA” and “copyB”. In some alignments containing multiple genes, coding sequences in opposite orientations were reverse complemented (relative to the rest of the aligned sequence) to keep open reading frames in the proper orientation. The following genes were reverse complemented relative to the rest of the aligned sequence: CG15287, CG14860, CG8016, CG8589, CG13190, CG13063, CG11070, CG16734, CG16983. Supplementary material 6 (ZIP 198 kb)
Cite this article
Meisel, R.P. Evolutionary Dynamics of Recently Duplicated Genes: Selective Constraints on Diverging Paralogs in the Drosophila pseudoobscura Genome. J Mol Evol 69, 81–93 (2009). https://doi.org/10.1007/s00239-009-9254-1
DOI: https://doi.org/10.1007/s00239-009-9254-1