Abstract
Directed evolution methods have proved highly effective in the design of novel proteins and in the generation of large libraries of diverse sequences. However, searching through the vast number of mutants produced in such experiments to find the best variants is a daunting task. In recent years, a number of computational tools have been developed to guide this exploratory process. It can, however, be unclear which tool or tools best complement the chosen library design strategy. In this review, we describe and critically evaluate some of the more notable tools in this area, discussing the rationale behind each, the requirements for their implementation, and potential issues faced when using them. Some examples of their application in an experimental setting are also provided. The tools are classified by their contrasting strategies: the prospective tools SCHEMA and OPTCOMB use extant sequence and structural data to predict optimal locations for crossover sites, whereas the retrospective tools ProSAR and ASRA use property data from the mutant library to predict beneficial mutations and features. From our evaluation, we suggest that each tool can play a role in the design process; however, that role is largely dictated by the data available and the desired experimental strategy for the project.
Copyright information
© 2014 Springer Science+Business Media New York
About this protocol
Cite this protocol
Zaugg, J., Gumulya, Y., Gillam, E.M.J., Bodén, M. (2014). Computational Tools for Directed Evolution: A Comparison of Prospective and Retrospective Strategies. In: Gillam, E., Copp, J., Ackerley, D. (eds) Directed Evolution Library Creation. Methods in Molecular Biology, vol 1179. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-1053-3_21
DOI: https://doi.org/10.1007/978-1-4939-1053-3_21
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-1052-6
Online ISBN: 978-1-4939-1053-3
eBook Packages: Springer Protocols