Prediction of Functional Sites in Proteins by Evolutionary Methods

Chapter in: Methods in Proteome and Protein Analysis

Part of the book series: Principles and Practice (PRINCIPLES)

Abstract

Functional sites are well-defined regions that are relevant for protein function and that include characteristic groups of amino acids. These regions may be involved in the interaction between proteins and other molecules, such as other proteins, nucleic acids, small ligands and substrates. Interaction sites have been studied in great detail in representative protein families: their relationship with natural substrates and drugs has been characterized, as has their role in mediating protein complex formation. In many cases they have been studied for their potential in engineering protein activity. Protein binding sites have also been studied at a more general level, by characterizing their typical structure and their general residue preferences. However, it is the relationship between the conservation of sequence features and protein active sites and binding sites that forms the basis for the development of prediction methods. The conservation of the chemical characteristics of amino acids in specific groups of sequences, in the context of large protein families, is one approach within a growing collection of methods aimed at predicting protein binding sites at a genomic scale. In this review we analyze these methods, discuss their similarities, and describe a number of key unsolved problems.
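To make the idea of conserved chemical characteristics concrete, the following is a minimal sketch of one generic way to score each column of a multiple sequence alignment. It is not the procedure of any specific method reviewed in the chapter. The amino acid grouping in `GROUPS`, the function names, and the toy alignment are all illustrative assumptions. Each residue is mapped to a chemical class and the Shannon entropy of the class frequencies is computed per column, so that low values indicate positions where the chemical character is conserved.

```python
import math
from collections import Counter

# Hypothetical grouping of residues by chemical character (an assumption
# made for this sketch; other groupings are equally defensible).
GROUPS = {
    "hydrophobic": "AVLIMC",
    "aromatic": "FWY",
    "polar": "STNQ",
    "positive": "KRH",
    "negative": "DE",
    "special": "GP",
}
AA_TO_GROUP = {aa: name for name, members in GROUPS.items() for aa in members}


def column_entropy(column):
    """Shannon entropy (bits) of chemical-class frequencies in one column.

    Gaps and unknown symbols are ignored; an all-gap column yields NaN.
    """
    classes = [AA_TO_GROUP[aa] for aa in column if aa in AA_TO_GROUP]
    if not classes:
        return float("nan")
    n = len(classes)
    counts = Counter(classes)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())


def conservation_profile(alignment):
    """Per-column entropy for a list of equal-length aligned sequences."""
    return [column_entropy(col) for col in zip(*alignment)]


# Toy alignment (invented for illustration only).
alignment = ["KDGA", "RDGL", "KEGV", "RDGF"]
profile = conservation_profile(alignment)
```

In this toy alignment the first three columns vary in residue identity but not in chemical class, so they score as fully conserved, while the last column mixes hydrophobic and aromatic residues and scores higher. Practical methods typically also weight sequences to correct for unequal representation and compare conservation within and between subfamilies, which this sketch does not attempt.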

© 2004 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

López-Romero, P., Gómez, M.J., Gómez-Puertas, P., Valencia, A. (2004). Prediction of Functional Sites in Proteins by Evolutionary Methods. In: Kamp, R.M., Calvete, J.J., Choli-Papadopoulou, T. (eds) Methods in Proteome and Protein Analysis. Principles and Practice. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-08722-0_22

  • DOI: https://doi.org/10.1007/978-3-662-08722-0_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-05779-3

  • Online ISBN: 978-3-662-08722-0

  • eBook Packages: Springer Book Archive
