Abstract
This paper proposes a novel tree-decomposition-based side-chain assignment algorithm that obtains the globally optimal solution of the side-chain packing problem very efficiently. Theoretically, the computational complexity of this algorithm is \(O((N + M)n_{rot}^{tw+1})\), where N is the number of residues in the protein, M the number of interacting residue pairs, \(n_{rot}\) the average number of rotamers per residue and tw the tree width of the residue interaction graph. Based on this algorithm, we have developed a side-chain prediction program, SCATD (Side Chain Assignment via Tree Decomposition). Experimental results show that after Goldstein DEE is conducted, \(n_{rot}\) is around 3.5, and tw is only 3 or 4 for most of the test proteins in the SCWRL benchmark and less than 10 for all of them. SCATD runs up to 90 times faster than SCWRL 3.0 on some large proteins in the SCWRL benchmark and is on average five times faster over all the test proteins. If only the post-DEE stage is considered, our tree-decomposition-based energy minimization algorithm is more than 200 times faster than that in SCWRL 3.0 on some large proteins. SCATD is freely available for academic research upon request.
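To make the complexity bound concrete, the following is a minimal Python sketch of exact energy minimization on a residue interaction graph. It is not the SCATD implementation: it uses bucket (variable) elimination along a min-degree ordering, which corresponds to dynamic programming on the tree decomposition induced by that ordering, so each elimination step enumerates at most \(n_{rot}^{w+1}\) rotamer combinations, where w is the width of the ordering. The function name `min_energy`, the dictionary-based input layout and the toy energies are all assumptions made for illustration.

```python
from itertools import product


def min_energy(singleton, pairwise):
    """Return (minimum total energy, one optimal rotamer assignment).

    singleton: dict residue -> list of self energies, one per rotamer
    pairwise:  dict (i, j) -> table E[a][b] for rotamer a of i, b of j
    (Hypothetical input layout, chosen only for this sketch.)
    """
    n_rot = {r: len(e) for r, e in singleton.items()}
    # A factor is (scope, table): table maps rotamer tuples to energies.
    factors = [((r,), {(a,): e[a] for a in range(n_rot[r])})
               for r, e in singleton.items()]
    for (i, j), E in pairwise.items():
        factors.append(((i, j), {(a, b): E[a][b]
                                 for a in range(n_rot[i])
                                 for b in range(n_rot[j])}))

    def neighbours(r):
        # Residues sharing at least one factor with r.
        s = set()
        for scope, _ in factors:
            if r in scope:
                s.update(scope)
        s.discard(r)
        return s

    remaining = set(singleton)
    trace = []  # (residue, scope, best rotamer per scope assignment)
    while remaining:
        # Min-degree heuristic: eliminate the residue with fewest neighbours.
        v = min(remaining, key=lambda r: (len(neighbours(r)), str(r)))
        scope = tuple(sorted(neighbours(v)))
        touching = [f for f in factors if v in f[0]]
        factors = [f for f in factors if v not in f[0]]
        table, best = {}, {}
        # Enumerate all rotamer combinations of the neighbours of v.
        for combo in product(*(range(n_rot[u]) for u in scope)):
            ctx = dict(zip(scope, combo))
            costs = []
            for a in range(n_rot[v]):
                ctx[v] = a
                costs.append(sum(t[tuple(ctx[u] for u in s)]
                                 for s, t in touching))
            a_best = min(range(n_rot[v]), key=costs.__getitem__)
            table[combo], best[combo] = costs[a_best], a_best
        factors.append((scope, table))
        trace.append((v, scope, best))
        remaining.remove(v)

    # Only constant factors (empty scope) are left; their sum is the optimum.
    energy = sum(t[()] for _, t in factors)
    # Backtrack in reverse elimination order to recover the assignment.
    assignment = {}
    for v, scope, best in reversed(trace):
        assignment[v] = best[tuple(assignment[u] for u in scope)]
    return energy, assignment


# Toy chain of three residues with two rotamers each (made-up energies).
singleton = {0: [0.0, 1.0], 1: [0.5, 0.0], 2: [0.0, 0.2]}
pairwise = {(0, 1): [[2.0, 0.0], [0.0, 3.0]],
            (1, 2): [[0.0, 1.0], [1.0, 0.0]]}
energy, assignment = min_energy(singleton, pairwise)
print(energy, sorted(assignment.items()))
```

With the values reported in the abstract (\(n_{rot}\) around 3.5 and tw of 3 or 4), each such table has on the order of \(3.5^{5} \approx 525\) entries, which is why the exact search stays cheap once the tree width is small.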
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Xu, J. (2005). Rapid Protein Side-Chain Packing via Tree Decomposition. In: Miyano, S., Mesirov, J., Kasif, S., Istrail, S., Pevzner, P.A., Waterman, M. (eds) Research in Computational Molecular Biology. RECOMB 2005. Lecture Notes in Computer Science, vol 3500. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11415770_32
DOI: https://doi.org/10.1007/11415770_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25866-7
Online ISBN: 978-3-540-31950-4
eBook Packages: Computer Science (R0)