A collaborative environment for developing and validating predictive tools for protein biophysical characteristics



The exchange of information between experimentalists and theoreticians is crucial to improving the predictive ability of theoretical methods and hence our understanding of the related biology. However, many barriers prevent the flow of information between the two disciplines. Effective collaboration requires that experimentalists can easily apply computational tools to their data and share those data with theoreticians, and that both the experimental data and the computational results are accessible to the wider community. We present a prototype collaborative environment for developing and validating predictive tools for protein biophysical characteristics. The environment is built on two central components: a new Python-based integration module, which allows theoreticians to provide and manage remote access to their programs, and PEATDB, a program for storing and sharing experimental data from protein biophysical characterisation studies. We demonstrate our approach by integrating PEATSA, a web-based service for predicting changes in protein biophysical characteristics, into PEATDB. Furthermore, we illustrate how the resulting environment aids method development using the Potapov dataset of experimentally measured ΔΔGfold values, previously employed to validate and train protein stability prediction algorithms.
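Validating a stability predictor against measured ΔΔGfold values typically comes down to comparing predicted and experimental numbers for the same mutations. The following is a minimal sketch of such a comparison, not the PEATDB or PEATSA implementation: the function names, the mutation-code strings, and the numerical values are all hypothetical placeholders (they are not taken from the Potapov dataset), and both inputs are assumed to use the same energy units.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def rmse(xs, ys):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


def validate(experimental, predicted):
    """Compare predictions with measurements for the mutations both share.

    Both arguments map a mutation code (hypothetical format) to a
    ddG value; mutations present in only one mapping are ignored.
    """
    shared = sorted(set(experimental) & set(predicted))
    exp_values = [experimental[m] for m in shared]
    pred_values = [predicted[m] for m in shared]
    return {
        "n": len(shared),
        "pearson_r": pearson_r(exp_values, pred_values),
        "rmse": rmse(exp_values, pred_values),
    }


# Placeholder values for illustration only; not real measurements.
experimental = {"L10A": 1.0, "V25G": 2.0, "I40V": 3.0}
predicted = {"L10A": 1.5, "V25G": 2.5, "I40V": 3.5, "F52L": 0.7}

report = validate(experimental, predicted)
print(report)
```

Reporting both a correlation and an error measure is a deliberate choice in this sketch: a predictor can rank mutations well while still being systematically offset from the measured values, and the two numbers expose those cases separately.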






Funding: Science Foundation Ireland (SFI) President of Ireland Young Researcher award (Grant 04/YI1/M537 to J.E.N.) and SFI Research Frontiers award (Grant 08/RFP/BIC1140 to J.E.N.).

Author information

Correspondence to Michael A. Johnston.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (MP4 12257 kb)



About this article

Cite this article

Johnston, M.A., Farrell, D. & Nielsen, J.E. A collaborative environment for developing and validating predictive tools for protein biophysical characteristics. J Comput Aided Mol Des 26, 387–396 (2012).



  • Protein stability
  • Prediction
  • Protein design
  • Data analysis
  • Data integration
  • Molecular modelling