An exact algorithm for side-chain placement in protein design

Abstract

Computational protein design aims to construct novel or improved functions on the structure of a given protein backbone and has important applications in the pharmaceutical and biotechnology industries. The underlying combinatorial side-chain placement (SCP) problem consists of choosing a side-chain placement for each residue position such that the resulting overall energy is minimal. The choice of side chain then also determines the amino acid at that position. Many algorithms for this \(\mathcal{NP}\)-hard problem have been proposed in the context of homology modeling, but they reach their limits when faced with large protein design instances. In this paper, we propose a new exact method for the SCP problem that works well even for the large instances that arise in protein design. Our main contribution is a dedicated branch-and-bound algorithm that combines tight upper and lower bounds obtained from a novel Lagrangian relaxation approach for SCP. Our experimental results show that our method outperforms alternative state-of-the-art exact approaches and makes it possible to solve large protein design instances to optimality routinely.
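To make the problem statement concrete, the following is a minimal sketch of SCP as a discrete optimization problem, solved by a plain depth-first branch-and-bound. It assumes the common formulation in which the total energy is a sum of per-position self energies and pairwise interaction energies. The lower bound used here is a simple one based on per-position minima; it is not the Lagrangian bound of the paper, which the abstract does not spell out. All names (`self_e`, `pair_e`, `solve_scp`) and the toy energy values are hypothetical and chosen only for illustration.

```python
from itertools import product

def total_energy(assign, self_e, pair_e):
    """Energy of the first len(assign) positions: self terms plus pair terms."""
    n = len(assign)
    e = sum(self_e[i][assign[i]] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            e += pair_e[(i, j)][assign[i]][assign[j]]
    return e

def lower_bound(partial, self_e, pair_e):
    """Simple valid bound: exact energy of the fixed prefix plus, for every
    open position, its cheapest choice with optimistic pair terms."""
    n, k = len(self_e), len(partial)
    bound = total_energy(partial, self_e, pair_e)
    for i in range(k, n):
        best = float("inf")
        for r in range(len(self_e[i])):
            e = self_e[i][r]
            # exact interaction with positions that are already fixed
            e += sum(pair_e[(j, i)][partial[j]][r] for j in range(k))
            # optimistic interaction with later, still open positions
            e += sum(min(pair_e[(i, j)][r]) for j in range(i + 1, n))
            best = min(best, e)
        bound += best
    return bound

def solve_scp(self_e, pair_e):
    """Depth-first branch-and-bound over positions 0..n-1."""
    n = len(self_e)
    best = {"energy": float("inf"), "assign": None}

    def branch(partial):
        if len(partial) == n:
            e = total_energy(partial, self_e, pair_e)
            if e < best["energy"]:
                best["energy"], best["assign"] = e, tuple(partial)
            return
        # prune subtrees that cannot improve on the incumbent
        if lower_bound(partial, self_e, pair_e) >= best["energy"]:
            return
        for r in range(len(self_e[len(partial)])):
            branch(partial + [r])

    branch([])
    return best["assign"], best["energy"]

def brute_force(self_e, pair_e):
    """Exhaustive enumeration, used only to cross-check the toy instance."""
    choices = product(*(range(len(row)) for row in self_e))
    return min(total_energy(a, self_e, pair_e) for a in choices)

# Toy instance (made-up numbers): 3 positions with 2 candidate side chains each.
SELF_E = [[1, 0], [0, 2], [1, 1]]
PAIR_E = {
    (0, 1): [[0, 3], [4, 0]],
    (0, 2): [[1, 0], [0, 5]],
    (1, 2): [[2, 0], [0, 1]],
}

assignment, energy = solve_scp(SELF_E, PAIR_E)
print(assignment, energy)
```

The search space grows as the product of the number of candidates per position, which is why the quality of the bounds decides whether large design instances are tractable; the weak bound above is only meant to show where a tighter one would plug in.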

Author information

Corresponding author

Correspondence to Gunnar W. Klau.

About this article

Cite this article

Canzar, S., Toussaint, N.C. & Klau, G.W. An exact algorithm for side-chain placement in protein design. Optim Lett 5, 393–406 (2011). https://doi.org/10.1007/s11590-011-0308-0

Keywords

  • Lagrangian relaxation
  • Branch-and-bound
  • Algorithm engineering
  • Protein design
  • Side-chain placement