Fast Side Chain Replacement in Proteins Using a Coarse-Grained Approach for Evaluating the Effects of Mutation During Evolution

Journal of Molecular Evolution

Abstract

For high-throughput structural genomic and evolutionary bioinformatics approaches, there is a clear need for fast methods to evaluate substitutions structurally. Coarse-grained methods are both powerful and fast, and a coarse-grained approach to positioning substituted side chains is presented here. With this method, single-residue replacement is at least sevenfold faster than with modern all-atom approaches. At the same time, the approach maintains a small median RMSD from the leading all-atom approach (as measured in coarse-grained space), predicts the conformation of point mutants with similar accuracy, and generates biologically realistic side chain angles. The method is also substantially more predictable in its run time, making it useful for high-throughput studies of protein structural evolution. To demonstrate its utility, the method has been implemented in a forward simulation of sequences threaded through the SH2 domains, with selective pressures to fold and bind specifically. The relative substitution rates across the protein structure and at the binding interface reflect those observed in SH2 domain evolution. The algorithm has been implemented in C++, with the source code and binaries (currently supported for Linux systems) freely available as SARA at http://www.wyomingbioinformatics.org/LiberlesGroup/SARA.

Acknowledgments

This study was supported by an institutional NIH INBRE award to University of Wyoming (P20 RR016474). Jan Kubelka is supported by NSF CAREER award 0846140. David Liberles receives support from NSF award DBI-0743374.

Author information

Correspondence to Johan A. Grahnen or David A. Liberles.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 50 kb)

About this article

Grahnen, J.A., Kubelka, J. & Liberles, D.A. Fast Side Chain Replacement in Proteins Using a Coarse-Grained Approach for Evaluating the Effects of Mutation During Evolution. J Mol Evol 73, 23–33 (2011). https://doi.org/10.1007/s00239-011-9454-3
