A High Performing Tool for Residue Solvent Accessibility Prediction

  • Lorenzo Palmieri
  • Maria Federico
  • Mauro Leoncini
  • Manuela Montangero
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6865)


Considerable effort has been spent in recent years on bridging the gap between the huge number of sequenced proteins and the relatively few solved structures. Predicting the Relative Solvent Accessibility (RSA) of residues in protein complexes is a key step towards predicting secondary structure and protein-protein interaction sites. A number of software tools for RSA prediction, based on very different approaches, have been produced over the last twenty years. Here, we present a binary classifier that implements a new method based mainly on sequence homology and realized by means of look-up tables. The tool exploits the similarity of solvent exposure patterns in the neighboring context of residues in similar protein chains, using BLAST searches and DSSP structure data. On a widely used dataset, it achieves a two-state classification with 89.5% accuracy and a correlation coefficient of 0.79 against the real data.
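To make the two reported figures concrete, the following minimal Python sketch shows one way a two-state (buried/exposed) prediction could be scored against reference labels. It is an illustration only, not the authors' code: the 25% RSA threshold, the function names `binarize` and `accuracy_and_mcc`, and the use of the Matthews correlation coefficient are all assumptions, since the abstract states neither the threshold nor which correlation coefficient is reported.

```python
from math import sqrt


def binarize(rsa_values, threshold=0.25):
    """Label a residue as exposed (True) when its RSA exceeds the threshold.

    The 0.25 default is an assumed cut-off, not one taken from the paper.
    """
    return [rsa > threshold for rsa in rsa_values]


def accuracy_and_mcc(predicted, actual):
    """Return (accuracy, Matthews correlation coefficient) for two-state labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # exposed, predicted exposed
    tn = sum(1 for p, a in pairs if not p and not a)  # buried, predicted buried
    fp = sum(1 for p, a in pairs if p and not a)      # buried, predicted exposed
    fn = sum(1 for p, a in pairs if not p and a)      # exposed, predicted buried

    accuracy = (tp + tn) / len(pairs)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the coefficient is set to 0 when a marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


if __name__ == "__main__":
    # Toy RSA values, purely hypothetical.
    reference = binarize([0.40, 0.10, 0.05, 0.20])
    prediction = binarize([0.35, 0.30, 0.02, 0.15])
    print(accuracy_and_mcc(prediction, reference))
```

Accuracy alone can be misleading when buried and exposed residues are unevenly represented, which is one reason a correlation coefficient is commonly reported alongside it.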


Support Vector Regression; Query Sequence; Solvent Accessibility; Accessible Surface Area; Similarity Depth
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Lorenzo Palmieri (1)
  • Maria Federico (1)
  • Mauro Leoncini (1, 2)
  • Manuela Montangero (1, 2)
  1. Dipartimento di Ingegneria dell’Informazione, Università di Modena e Reggio Emilia, Italy
  2. CNR, Istituto di Informatica e Telematica, Pisa, Italy
