Abstract
Identifying the hotspots through which a protein interacts with other macromolecules or drugs provides insight into functional aspects of the protein network and is a pivotal task in systems biology and drug discovery. Here, we present a protocol for applying a machine-learning method, Random Forest, to the prediction of interacting residues in proteins, based on either structural parameters or the primary sequence alone.
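The general idea of classifying residues with a Random Forest can be illustrated with a minimal, dependency-free sketch. It is not the protocol described in the chapter: the three per-residue features (relative solvent accessibility, hydrophobicity, conservation), the synthetic and deliberately well-separated toy data, and all function names are assumptions made purely for illustration. Each tree is grown on a bootstrap sample and considers a random subset of features at every split; the forest predicts by majority vote.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values()) if n else 0.0

def best_split(rows, labels, feature_ids):
    """Return the (feature, threshold) that most reduces weighted Gini, or None."""
    best, best_score, n = None, gini(labels), len(rows)
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def grow_tree(rows, labels, rng, max_depth, n_try):
    """Grow one tree; a leaf is an int label, an inner node is a tuple."""
    majority = Counter(labels).most_common(1)[0][0]
    if max_depth == 0 or len(set(labels)) == 1:
        return majority
    # Random feature subset at every split (the "random" in Random Forest).
    feats = rng.sample(range(len(rows[0])), n_try)
    split = best_split(rows, labels, feats)
    if split is None:
        return majority
    f, t = split
    li = [i for i, r in enumerate(rows) if r[f] <= t]
    ri = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            grow_tree([rows[i] for i in li], [labels[i] for i in li], rng, max_depth - 1, n_try),
            grow_tree([rows[i] for i in ri], [labels[i] for i in ri], rng, max_depth - 1, n_try))

def tree_predict(node, row):
    """Walk from the root to a leaf and return the leaf label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def train_forest(rows, labels, n_trees=25, max_depth=4, seed=0):
    """Train n_trees trees, each on a bootstrap sample of the residues."""
    rng = random.Random(seed)
    n_try = max(1, int(len(rows[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        forest.append(grow_tree([rows[i] for i in idx], [labels[i] for i in idx],
                                rng, max_depth, n_try))
    return forest

def forest_predict(forest, row):
    """Majority vote over all trees."""
    return Counter(tree_predict(t, row) for t in forest).most_common(1)[0][0]

# Synthetic toy data (assumption): one row per residue with three hypothetical
# features [accessibility, hydrophobicity, conservation]; label 1 = interacting.
rows, labels = [], []
for label in (0, 1):
    base = 0.6 * label  # interacting residues get uniformly higher values
    for i in range(20):
        rows.append([base + (i % 20) / 50,
                     base + ((7 * i) % 20) / 50,
                     base + ((3 * i) % 20) / 50])
        labels.append(label)

forest = train_forest(rows, labels)
print(forest_predict(forest, [1.0, 1.0, 1.0]), forest_predict(forest, [0.0, 0.0, 0.0]))
```

In practice an established implementation would normally be preferred over a hand-written one, and real feature vectors would come from structure- or sequence-derived descriptors rather than toy values.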
Copyright information
© 2012 Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Franke, V., Šikić, M., Vlahoviček, K. (2012). Prediction of Interacting Protein Residues Using Sequence and Structure Data. In: Baron, R. (eds) Computational Drug Discovery and Design. Methods in Molecular Biology, vol 819. Springer, New York, NY. https://doi.org/10.1007/978-1-61779-465-0_16
DOI: https://doi.org/10.1007/978-1-61779-465-0_16
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-61779-464-3
Online ISBN: 978-1-61779-465-0
eBook Packages: Springer Protocols