Investigating ancient duplication events in the Arabidopsis genome

  • Jeroen Raes
  • Klaas Vandepoele
  • Cedric Simillion
  • Yvan Saeys
  • Yves Van de Peer


The complete genomic analysis of Arabidopsis thaliana has shown that a major fraction of the genome consists of paralogous genes that probably originated through one or more ancient large-scale gene or genome duplication events. However, the number and timing of these duplications remain unclear, and several different hypotheses have recently been put forward. Here, we reanalyzed previously described duplicated blocks in the Arabidopsis genome and determined their date of divergence from estimates of silent substitutions between the paralogous genes and, where possible, by phylogenetic reconstruction. We show that methods based on averaging protein distances of heterogeneous classes of duplicated genes lead to unreliable conclusions and that a large fraction of blocks duplicated much more recently than previously assumed. We found clear evidence for one large-scale gene or even complete genome duplication event between 70 and 90 million years ago. Traces pointing to a much older (probably more than 200 million years) large-scale gene duplication event could also be detected. However, for now it is impossible to conclude whether these old duplicates are the result of one or more large-scale gene duplication events.
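As a rough illustration of how a silent (synonymous) substitution estimate can be turned into a date, the sketch below applies the standard molecular-clock relation T = Ks / (2r), where Ks is the number of synonymous substitutions per synonymous site between two paralogs and r is the substitution rate per site per year. The function name, the rate value, and the example Ks values are assumptions chosen for illustration; none of them is taken from the abstract, and the sketch is not the authors' procedure.

```python
def divergence_time_mya(ks, rate_per_site_per_year):
    """Estimate the age of a duplication in millions of years.

    After duplication both copies accumulate substitutions independently,
    so the distance between them grows at twice the per-lineage rate:
    Ks = 2 * r * T, hence T = Ks / (2 * r).
    """
    if ks < 0:
        raise ValueError("Ks must be non-negative")
    if rate_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    years = ks / (2.0 * rate_per_site_per_year)
    return years / 1e6


# ASSUMPTION: an illustrative synonymous substitution rate per site per year.
# It is a placeholder, not a value reported in the abstract.
ASSUMED_RATE = 1.5e-8

# Hypothetical Ks values for paralogous gene pairs in duplicated blocks.
for ks in (0.5, 1.0, 2.4):
    age = divergence_time_mya(ks, ASSUMED_RATE)
    print(f"Ks = {ks:.2f} -> about {age:.1f} million years")
```

The result scales inversely with the assumed rate, so any date obtained this way is only as reliable as the rate calibration and the assumption of a constant clock.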

Key words

large-scale gene duplications, plant genome evolution, polyploidy, synonymous substitution rate





Copyright information

© Springer Science+Business Media Dordrecht 2003

Authors and Affiliations

  • Jeroen Raes (1)
  • Klaas Vandepoele (1)
  • Cedric Simillion (1)
  • Yvan Saeys (1)
  • Yves Van de Peer (1), corresponding author

  1. Department of Plant Systems Biology, Flanders Interuniversity Institute for Biotechnology, Ghent University, Gent, Belgium
