Skip to main content

Parallel Computational Protein Design

Part of the Methods in Molecular Biology book series (MIMB,volume 1529)

Abstract

Computational structure-based protein design (CSPD) is an important problem in computational biology, which aims to design or improve a prescribed protein function based on a protein structure template. It provides a practical tool for real-world protein engineering applications. A popular CSPD method that guarantees to find the global minimum energy solution (GMEC) is to combine both dead-end elimination (DEE) and A* tree search algorithms. However, in this framework, the A* search algorithm can run in exponential time in the worst case, which may become the computation bottleneck of large-scale computational protein design process. To address this issue, we extend and add a new module to the OSPREY program that was previously developed in the Donald lab (Gainza et al., Methods Enzymol 523:87, 2013) to implement a GPU-based massively parallel A* algorithm for improving protein design pipeline. By exploiting the modern GPU computational framework and optimizing the computation of the heuristic function for A* search, our new program, called gOSPREY, can provide up to four orders of magnitude speedups in large protein design cases with a small memory overhead comparing to the traditional A* search algorithm implementation, while still guaranteeing the optimality. In addition, gOSPREY can be configured to run in a bounded-memory mode to tackle the problems in which the conformation space is too large and the global optimal solution cannot be computed previously. Furthermore, the GPU-based A* algorithm implemented in the gOSPREY program can be combined with the state-of-the-art rotamer pruning algorithms such as iMinDEE (Gainza et al., PLoS Comput Biol 8:e1002335, 2012) and DEEPer (Hallen et al., Proteins 81:18–39, 2013) to also consider continuous backbone and side-chain flexibility.

Key words

  • Protein design
  • A*
  • Dead-end elimination
  • GPGPU
  • CUDA
  • Parallel computing

This is a preview of subscription content, access via your institution.

Buying options

Protocol
USD   49.95
Price excludes VAT (USA)
  • DOI: 10.1007/978-1-4939-6637-0_13
  • Chapter length: 13 pages
  • Instant PDF download
  • Readable on all devices
  • Own it forever
  • Exclusive offer for individuals only
  • Tax calculation will be finalised during checkout
eBook
USD   109.00
Price excludes VAT (USA)
  • ISBN: 978-1-4939-6637-0
  • Instant PDF download
  • Readable on all devices
  • Own it forever
  • Exclusive offer for individuals only
  • Tax calculation will be finalised during checkout
Softcover Book
USD   139.99
Price excludes VAT (USA)
Hardcover Book
USD   199.99
Price excludes VAT (USA)
Fig. 1
Fig. 2
Fig. 3

Springer Nature is developing a new tool to find and evaluate Protocols. Learn more

References

  1. Roberts KE, Cushing PR, Boisguerin P, Madden DR, Donald BR (2012) Computational design of a PDZ domain peptide inhibitor that rescues CFTR activity. PLoS Comput Biol 8(4):e1002477

    CAS  CrossRef  PubMed  PubMed Central  Google Scholar 

  2. Gorczynski MJ, Grembecka J, Zhou Y, Kong Y, Roudaia L, Douvas MG, Newman M, Bielnicka I, Baber G, Corpora T, Shi J, Sridharan M, Lilien R, Donald BR, Speck NA, Brown ML, Bushweller JH (2007) Allosteric inhibition of the protein-protein interaction between the leukemia-associated proteins Runx1 and CBF$\beta$. Chem Biol 14(10):1186–1197

    CAS  CrossRef  PubMed  Google Scholar 

  3. Frey KM, Georgiev I, Donald BR, Anderson AC (2010) Predicting resistance mutations using protein design algorithms. Proc Natl Acad Sci 107(31):13707–13712

    CAS  CrossRef  PubMed  PubMed Central  Google Scholar 

  4. Reeve SM, Gainza P, Frey KM, Georgiev I, Donald BR, Anderson AC (2015) Protein design algorithms predict viable resistance to an experimental antifolate. Proc Natl Acad Sci 112(3):749–754

    CAS  CrossRef  PubMed  Google Scholar 

  5. Reardon PN, Sage H, Dennison SM, Martin JW, Donald BR, Alam SM, Haynes BF, Spicer LD (2014) Structure of an HIV-1-neutralizing antibody target, the lipid-bound gp41 envelope membrane proximal region trimer. Proc Natl Acad Sci 111(4):1391–1396

    CAS  CrossRef  PubMed  PubMed Central  Google Scholar 

  6. Rudicell RS, Kwon YD, Ko S-Y, Pegu A, Louder MK, Georgiev IS, Wu X, Zhu J, Boyington JC, Chen X, Shi W, Yang ZY, Doria-Rose NA, McKee K, O'Dell S, Schmidt SD, Chuang GY, Druz A, Soto C, Yang Y, Zhang B, Zhou T, Todd JP, Lloyd KE, Eudailey J, Roberts KE, Donald BR, Bailer RT, Ledgerwood J, NISC Comparative Sequencing Program, Mullikin JC, Shapiro L, Koup RA, Graham BS, Nason MC, Connors M, Haynes BF, Rao SS, Roederer M, Kwong PD, Mascola JR, Nabel GJ (2014) Enhanced potency of a broadly neutralizing HIV-1 antibody in vitro improves protection against lentiviral infection in vivo. J Virol 88(21):12669–12682

    CrossRef  PubMed  PubMed Central  Google Scholar 

  7. Chen C-Y, Georgiev I, Anderson AC, Donald BR (2009) Computational structure-based redesign of enzyme activity. Proc Natl Acad Sci 106(10):3764–3769

    CAS  CrossRef  PubMed  PubMed Central  Google Scholar 

  8. Chazelle B, Kingsford C, Singh M (2004) A semidefinite programming approach to side chain positioning with new rounding strategies. INFORMS J Comput 16(4):380–392

    CrossRef  Google Scholar 

  9. Pierce NA, Winfree E (2002) Protein design is NP-hard. Protein Eng 15(10):779–782

    CAS  CrossRef  PubMed  Google Scholar 

  10. Kuhlman B, Baker D (2000) Native protein sequences are close to optimal for their structures. Proc Natl Acad Sci 97(19):10383–10388

    CAS  CrossRef  PubMed  PubMed Central  Google Scholar 

  11. Marvin JS, Hellinga HW (2001) Conversion of a maltose receptor into a zinc biosensor by computational design. Proc Natl Acad Sci 98(9):4955–4960

    CAS  CrossRef  PubMed  PubMed Central  Google Scholar 

  12. Shah PS, Hom GK, Mayo SL (2004) Preprocessing of rotamers for protein design calculations. J Comput Chem 25(14):1797–1800

    CAS  CrossRef  PubMed  Google Scholar 

  13. Street AG, Mayo SL (1999) Computational protein design. Structure 7(5):R105–R109

    CAS  CrossRef  PubMed  Google Scholar 

  14. Kingsford CL, Chazelle B, Singh M (2005) Solving and analyzing side-chain positioning problems using linear and integer programming. Bioinformatics 21(7):1028–1039

    CAS  CrossRef  PubMed  Google Scholar 

  15. Hong E-J, Lippow SM, Tidor B, Lozano-Pérez T (2009) Rotamer optimization for protein design through MAP estimation and problem-size reduction. J Comput Chem 30(12):1923–1945

    CAS  CrossRef  PubMed  PubMed Central  Google Scholar 

  16. Zhou Y, Wu Y, Zeng J (2015) Computational protein design using AND/OR branch-and-bound search. J Comput Biol 23(6):439–451. doi:10.1089/cmb.2015.0212

    Google Scholar 

  17. Xu J, Berger B (2006) Fast and accurate algorithms for protein side-chain packing. JACM) 53(4):533–557

    CrossRef  Google Scholar 

  18. Gainza P, Roberts KE, Donald BR (2012) Protein design using continuous rotamers. PLoS Comput Biol 8(1), e1002335

    CAS  CrossRef  PubMed  PubMed Central  Google Scholar 

  19. Desmet J, Maeyer MD, Hazes B, Lasters I (1992) The dead-end elimination theorem and its use in protein side-chain positioning. Nature 356(6369):539–542

    CAS  CrossRef  PubMed  Google Scholar 

  20. Gainza P, Roberts KE, Georgiev I, Lilien RH, Keedy DA, Chen C-Y, Reza F, Anderson AC, Richardson DC, Richardson JS, Donald BR (2013) OSPREY: protein design with ensembles, flexibility, and provable algorithms. Methods Enzymol 523:87

    CAS  CrossRef  PubMed  PubMed Central  Google Scholar 

  21. Leach AR, Lemon AP (1998) Exploring the conformational space of protein side chains using dead-end elimination and the A* algorithm. Proteins 33(2):227–239

    CAS  CrossRef  PubMed  Google Scholar 

  22. Lippow SM, Tidor B (2007) Progress in computational protein design. Curr Opin Biotechnol 18(4):305–311

    CAS  CrossRef  PubMed  PubMed Central  Google Scholar 

  23. Donald BR (2011) Algorithms in structural molecular biology. The MIT Press, Cambridge, MA

    Google Scholar 

  24. Georgiev I, Lilien RH, Donald BR (2006) Improved pruning algorithms and divide-and-conquer strategies for dead-end elimination, with application to protein design. Bioinformatics 22(14):e174–e183

    CAS  CrossRef  PubMed  Google Scholar 

  25. Georgiev I, Lilien RH, Donald BR (2008) The minimized dead-end elimination criterion and its application to protein redesign in a hybrid scoring and search algorithm for computing partition functions over molecular ensembles. J Comput Chem 29(10):1527–1542

    CAS  CrossRef  PubMed  PubMed Central  Google Scholar 

  26. Gregg C, Hazelwood K (2011) Where is the data? Why you cannot debate CPU vs. GPU performance without the answer. In: 2011 I.E. International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 134–144

    Google Scholar 

  27. Lee VW, Kim C, Chhugani J, Deisher M, Kim D, Nguyen AD, Satish N, Smelyanskiy M, Chennupaty S, Hammarlund P, Singhal R, Dubey P (2010) Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU. ACM SIGARCH Comput Archit News 38(3):451–460

    CrossRef  Google Scholar 

  28. Hart PE, Nilsson NJ, Raphael B (1968) A formal basis for the heuristic determination of minimum cost paths. IEEE Trans Syst Sci Cybern 4(2):100–107

    CrossRef  Google Scholar 

  29. Zhou Y, Xu W, Donald BR, Zeng J (2014) An efficient parallel algorithm for accelerating computational protein design. Bioinformatics 30(12):i255–i263

    CAS  CrossRef  PubMed  PubMed Central  Google Scholar 

  30. Hallen MA, Keedy DA, Donald BR (2013) Dead-end elimination with perturbations (DEEPer): a provable protein design algorithm with continuous sidechain and backbone flexibility. Proteins 81(1):18–39

    CAS  CrossRef  PubMed  Google Scholar 

Download references

Acknowledgments

This work is supported in part by the National Basic Research Program of China Grant 2011CBA00300, 2011CBA00301, the National Natural Science Foundation of China Grant 61033001, 61361136003. This work is supported by a grant to B.R.D. from the National Institutes of Health (R01 GM-78031).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Jianyang Zeng .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and Permissions

Copyright information

© 2017 Springer Science+Business Media New York

About this protocol

Cite this protocol

Zhou, Y., Donald, B.R., Zeng, J. (2017). Parallel Computational Protein Design. In: Samish, I. (eds) Computational Protein Design. Methods in Molecular Biology, vol 1529. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-6637-0_13

Download citation

  • DOI: https://doi.org/10.1007/978-1-4939-6637-0_13

  • Published:

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-6635-6

  • Online ISBN: 978-1-4939-6637-0

  • eBook Packages: Springer Protocols