Abstract
Functional sites are well-defined regions of a protein that are relevant for its function and that contain characteristic groups of amino acids. These regions may be involved in the interaction between proteins and other molecules, such as other proteins, nucleic acids, small ligands and substrates. Interaction sites have been studied in great detail in representative protein families: their relationship with natural substrates and drugs has been characterized, as has their role in mediating protein complex formation. In many cases they have been studied for their potential in engineering protein activity. Protein binding sites have also been studied at a more general level, by characterizing their typical structure and their general residue preferences. However, it is the relationship between sequence conservation and protein active sites and binding sites that forms the basis of prediction methods. In particular, the conservation of the chemical characteristics of amino acids within specific groups of sequences, in the context of large protein families, underlies a growing collection of methods aimed at predicting protein binding sites at a genomic scale. In this review we analyze these methods, discuss their similarities, and describe a number of key unsolved problems.
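As a rough illustration of what "conservation" means in this context, the sketch below scores each column of a multiple sequence alignment by its Shannon entropy, so that low values flag positions with little variation. It is not the method of any particular tool covered in the review. The function names, the toy alignment, and the choice to ignore gap characters are assumptions made only for this example.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column.

    Gap characters ('-') are ignored. By a convention chosen for this
    sketch, a column containing only gaps returns 0.0.
    """
    residues = [aa for aa in column if aa != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def conservation_profile(alignment):
    """Per-column entropy for a list of equal-length aligned sequences."""
    lengths = {len(seq) for seq in alignment}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # zip(*alignment) iterates over columns instead of rows
    return [column_entropy(col) for col in zip(*alignment)]


# Hypothetical toy alignment: columns 0 and 2 are invariant
toy_alignment = ["MKHA", "MRHG", "MKH-", "MRHS"]
profile = conservation_profile(toy_alignment)
```

An entropy of zero marks an invariant column. Methods that look at the chemical characteristics of residues would typically group amino acids into classes before counting, and methods that look at specific groups of sequences would compute such a score separately per subfamily; neither refinement is shown here.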
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
López-Romero, P., Gómez, M.J., Gómez-Puertas, P., Valencia, A. (2004). Prediction of Functional Sites in Proteins by Evolutionary Methods. In: Kamp, R.M., Calvete, J.J., Choli-Papadopoulou, T. (eds) Methods in Proteome and Protein Analysis. Principles and Practice. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-08722-0_22
DOI: https://doi.org/10.1007/978-3-662-08722-0_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05779-3
Online ISBN: 978-3-662-08722-0
eBook Packages: Springer Book Archive