An Evolution-Based Approach to De Novo Protein Design

Part of the Methods in Molecular Biology book series (MIMB, volume 1529)

Abstract

EvoDesign is a computational algorithm for rapidly generating new protein sequences that are compatible with a specified protein structure. It can be used to optimize protein stability, to resculpt the protein surface to eliminate undesired protein-protein interactions, and to optimize protein-protein binding. What chiefly distinguishes EvoDesign from other protein design programs is its use of evolutionary information to guide the sequence search toward native-like sequences known to adopt folds structurally similar to the target. This information takes the form of structural profiles: the observed frequencies of amino acids at specific positions in the structure, collected from proteins with similar folds and from complexes with similar interfaces. Such profiles can implicitly capture many subtle effects that are essential for correct folding and for protein-binding interactions. As a result of including evolutionary information, the sequences designed by EvoDesign have native-like folding and binding properties that are not seen with purely physics-based design methods. In this chapter, we describe how EvoDesign can be used to redesign proteins, with a focus on the computational and experimental procedures that can be used to validate the designs.
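
To illustrate the idea of a position-specific profile, the minimal Python sketch below builds an amino acid frequency table from a toy alignment and scores candidate sequences against it. It is not EvoDesign's implementation: the function names (`build_profile`, `profile_score`), the pseudocount, the uniform background, and the toy sequences are all assumptions made for illustration. A real structural profile would be derived from structure alignments of proteins with similar folds rather than from a hand-written alignment.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(aligned_seqs, pseudocount=1.0):
    """Per-position amino acid frequencies from equal-length aligned sequences.

    Gap characters (anything outside the 20 standard residues) are ignored.
    The pseudocount is an illustrative assumption that keeps every frequency > 0.
    """
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for pos in range(length):
        counts = Counter(s[pos] for s in aligned_seqs if s[pos] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append(
            {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}
        )
    return profile


def profile_score(sequence, profile, background=1.0 / len(AMINO_ACIDS)):
    """Sum of log-odds of each residue against a uniform background (assumed)."""
    if len(sequence) != len(profile):
        raise ValueError("sequence and profile must have the same length")
    return sum(
        math.log(profile[i][aa] / background) for i, aa in enumerate(sequence)
    )


# Hypothetical toy alignment standing in for sequences of structurally similar folds.
toy_alignment = ["MKVLA", "MKILA", "MRVLG", "MKV-A"]
profile = build_profile(toy_alignment)

native_like = profile_score("MKVLA", profile)  # residues common in the alignment
unlikely = profile_score("WWWWW", profile)     # residues never observed there
```

In this sketch a sequence built from frequently observed residues scores higher than one built from residues absent from the alignment, which is the sense in which a profile can steer a sequence search toward native-like sequences.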

Key words

  • Protein design
  • Evolutionary profile
  • Protein structure modeling
  • Experimental protein validation
  • Recombinant expression
  • Circular dichroism
  • Nuclear magnetic resonance


Acknowledgment

The project is supported in part by the National Institute of General Medical Sciences (GM083107).

Corresponding author

Correspondence to Yang Zhang.

Copyright information

© 2017 Springer Science+Business Media New York

About this protocol

Cite this protocol

Brender, J.R., Shultis, D., Khattak, N.A., Zhang, Y. (2017). An Evolution-Based Approach to De Novo Protein Design. In: Samish, I. (ed.) Computational Protein Design. Methods in Molecular Biology, vol 1529. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-6637-0_12

  • DOI: https://doi.org/10.1007/978-1-4939-6637-0_12

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-6635-6

  • Online ISBN: 978-1-4939-6637-0

  • eBook Packages: Springer Protocols