Volume 118, Issue 2–3, pp 217–231

Modular Assembly of Genes and the Evolution of New Functions

  • László Patthy


Modular assembly of novel genes from existing genes has long been thought to be an important source of evolutionary novelty. Thanks to major advances in genomic studies, it has now become clear that this mechanism contributed significantly to the evolution of novel biological functions in different evolutionary lineages. Analyses of completely sequenced bacterial, archaeal and eukaryotic genomes have revealed that modular assembly of novel constituents of various eukaryotic intracellular signalling pathways played a major role in the evolution of eukaryotes. Comparison of the genomes of single-celled eukaryotes, multicellular plants and animals has also shown that the evolution of multicellularity was accompanied by the assembly of numerous novel extracellular matrix proteins and extracellular signalling proteins that are absolutely essential for multicellularity. There is now strong evidence that exon-shuffling played a general role in the assembly of the modular proteins involved in extracellular communication in metazoa. Although some of these proteins seem to be shared by all major groups of metazoa, others are restricted to certain evolutionary lineages. The genomic features of the chordates appear to have favoured intronic recombination, as evidenced by the fact that exon-shuffling continued to be a major source of evolutionary novelty during vertebrate evolution.

exon-shuffling, extracellular signalling, intracellular signalling, introns, modular protein evolution





Copyright information

© Kluwer Academic Publishers 2003

Authors and Affiliations

  • László Patthy (1)
  1. Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, Budapest, Hungary
