Investigating ancient duplication events in the Arabidopsis genome

  • Jeroen Raes
  • Klaas Vandepoele
  • Cedric Simillion
  • Yvan Saeys
  • Yves Van de Peer


The complete genomic analysis of Arabidopsis thaliana has shown that a major fraction of the genome consists of paralogous genes that probably originated through one or more ancient large-scale gene or genome duplication events. However, the number and timing of these duplications remain unclear, and several different hypotheses have been put forward recently. Here, we reanalyzed previously described duplicated blocks in the Arabidopsis genome and determined their date of divergence based on silent substitution estimations between the paralogous genes and, where possible, by phylogenetic reconstruction. We show that methods based on averaging protein distances of heterogeneous classes of duplicated genes lead to unreliable conclusions and that a large fraction of blocks duplicated much more recently than previously assumed. We found clear evidence for one large-scale gene or even complete genome duplication event somewhere between 70 and 90 million years ago. Traces pointing to a much older (probably more than 200 million years) large-scale gene duplication event could be detected. However, for now it is impossible to conclude whether these old duplicates are the result of one or more large-scale gene duplication events.

Abbreviations: dA, fraction of amino acid substitutions; Kn, number of nonsynonymous substitutions per nonsynonymous site; Ks, number of synonymous substitutions per synonymous site; MYA, million years ago
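Dating a duplication from silent substitutions is commonly done with the relation T = Ks / (2λ), where λ is the synonymous substitution rate per site per year and the factor 2 accounts for both paralogs diverging independently. The following is a minimal sketch of that conversion, not the authors' actual pipeline. The function name, the example Ks value, and the rate used here are illustrative assumptions and are not taken from this page.

```python
def divergence_time_mya(ks: float, rate_per_site_per_year: float) -> float:
    """Convert a synonymous distance (Ks) into a divergence time in million years.

    Uses T = Ks / (2 * rate): both duplicates accumulate substitutions
    independently since the duplication, hence the factor 2.
    """
    if ks < 0:
        raise ValueError("Ks must be non-negative")
    if rate_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    years = ks / (2.0 * rate_per_site_per_year)
    return years / 1e6  # years -> million years


if __name__ == "__main__":
    # Hypothetical inputs for illustration only; neither value comes from the abstract.
    example_ks = 0.6       # assumed Ks between two paralogs
    assumed_rate = 1.5e-8  # assumed synonymous substitutions per site per year
    print(f"{divergence_time_mya(example_ks, assumed_rate):.1f} MYA")
```

Because Ks saturates for old duplicates, a conversion like this is only informative for relatively recent events. This matches the abstract's caution about the much older duplicates.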

large-scale gene duplications; plant genome evolution; polyploidy; synonymous substitution rate





Copyright information

© Kluwer Academic Publishers 2003

Authors and Affiliations

  • Jeroen Raes
  • Klaas Vandepoele
  • Cedric Simillion
    • 1
  • Yvan Saeys
    • 1
  • Yves Van de Peer
    • 1
    Email author
  1. Department of Plant Systems Biology, Flanders Interuniversity Institute for Biotechnology, Ghent University, Gent, Belgium
