Pseudogenes pp 87-99

Methods to Study the Occurrence and the Evolution of Pseudogenes Through a Phylogenetic Approach

  • Jacques Dainat
  • Pierre Pontarotti
Part of the Methods in Molecular Biology book series (MIMB, volume 1167)


In recent years, the study of pseudogenes has attracted growing interest because at least some of them have been shown to take part in important biological processes. Comparative methods can detect and analyze pseudogenes accurately, but only phylogenetic tools can provide reliable information about their birth, evolution, and death, and hence about their impact on genes and genomes. This chapter describes phylogenetic methods for studying pseudogene history.

Key words

Pseudogenes, Phylogeny, Speciation, Orthologs, Paralogs



This work was supported by the French National Research Agency [EvolHHupro: ANR-07-BLAN-0054]. Thanks to Philippe Gouret and Julien Paganini for their help on the development of the strategy to detect gene losses and unitary pseudogenes.



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Evolutionary Biology and Modeling Group, Aix-Marseille Université, Marseille Cedex 3, France
  2. Team DAVEM, UMR AGAP 1334, Montpellier SupAgro, Montpellier Cedex 1, France
