In Silico-Directed Evolution Using CADEE

  • Beat Anton Amrein
  • Ashish Runthala
  • Shina Caroline Lynn Kamerlin
Part of the Methods in Molecular Biology book series (MIMB, volume 1851)


Recent years have seen an explosion of interest in both sequence- and structure-based approaches toward in silico-directed evolution. We recently developed a novel computational toolkit, CADEE, which facilitates the computer-aided directed evolution of enzymes. Our initial work (Amrein et al., IUCrJ 4:50–64, 2017) presented a pedagogical example of the application of CADEE to triosephosphate isomerase, to illustrate the CADEE workflow. In this contribution, we describe this workflow in detail, including code input/output snippets, in order to allow users to set up and execute CADEE simulations on any system of interest.

Key words

Enzyme design; Directed evolution; Computational enzymology; Computational enzyme design; Empirical valence bond



Acknowledgments

The European Research Council provided financial support under the European Community’s Seventh Framework Programme (FP7/2007-2013)/ERC Grant Agreement 306474. SCLK would also like to thank the Knut and Alice Wallenberg Foundation and the Royal Swedish Academy of Sciences for a Wallenberg Academy Fellowship, and the Swedish Research Council for providing support through project grant 2015-04928. All calculations were performed on the Abisko cluster at the HPC2N center in Umeå and on the Triolith cluster at the NSC in Linköping, thanks to a generous supercomputing allocation provided by the Swedish National Infrastructure for Computing (SNIC grant 2015/16-12). In addition, we would like to thank Arina Gromova for extensive testing of CADEE, Fabian Steffen-Munsberg for initial testing, and Miha Purg for helpful discussions about qscripts/qtools.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Beat Anton Amrein (1)
  • Ashish Runthala (3)
  • Shina Caroline Lynn Kamerlin (2; corresponding author)

  1. Associate Scientist, Tecan Schweiz AG, Männedorf, Switzerland
  2. Department of Chemistry, BMC, Uppsala University, Uppsala, Sweden
  3. Indian Institute of Science, Bangalore, India
