# Combinatorial Problems on Strings with Applications to Protein Folding

## Abstract

We consider the problem of protein folding in the HP model on the 3D square lattice. This problem is combinatorially equivalent to folding a string of 0’s and 1’s so that the string forms a self-avoiding walk on the lattice and the number of adjacent pairs of 1’s is maximized. The previously best-known approximation algorithm for this problem has a guarantee of \(\frac{3}{8}=0.375\) [HI95]. In this paper, we first present a new \(\frac{3}{8}\)-approximation algorithm for the 3D folding problem that improves on the absolute approximation guarantee of the previous algorithm. We then show a connection between the 3D folding problem and a basic combinatorial problem on binary strings, which may be of independent interest. Given a binary string in \(\{a,b\}^{*}\), we want to find a long subsequence of the string in which every sequence of consecutive *a*’s is followed by at least as many consecutive *b*’s. We show a non-trivial lower bound on the existence of such subsequences. Using this result, we obtain an algorithm with a slightly improved approximation ratio of at least 0.37501 for the 3D folding problem. All of our algorithms run in linear time.
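To make the two combinatorial objects in the abstract concrete, the following is a minimal Python sketch, not the paper's algorithm. It only checks candidate solutions: it scores a given fold of a 0/1 string on the 3D lattice, and it tests whether a string over \(\{a,b\}\) has the property that every run of *a*’s is followed by at least as many *b*’s. The function names (`is_self_avoiding_walk`, `count_contacts`, `a_runs_covered`) are hypothetical. Two conventions are assumptions rather than statements from the abstract: pairs of 1’s that are neighbours in the string itself are not counted, and "followed by" is read as "immediately followed by the next maximal run of *b*’s".

```python
from itertools import groupby
from typing import Sequence, Tuple

Point = Tuple[int, int, int]


def is_self_avoiding_walk(path: Sequence[Point]) -> bool:
    """True if no lattice point repeats and consecutive points are one unit step apart."""
    if len(set(path)) != len(path):
        return False
    return all(
        sum(abs(p - q) for p, q in zip(a, b)) == 1
        for a, b in zip(path, path[1:])
    )


def count_contacts(s: str, path: Sequence[Point]) -> int:
    """Count pairs of 1's placed on adjacent lattice points.

    Assumption: pairs that are consecutive in the string are skipped,
    since they are adjacent in every fold.
    """
    if len(s) != len(path) or not is_self_avoiding_walk(path):
        raise ValueError("path must be a self-avoiding walk with one point per character")
    index_at = {pt: i for i, pt in enumerate(path)}
    contacts = 0
    for i, (x, y, z) in enumerate(path):
        if s[i] != "1":
            continue
        # Look only in the three positive directions so each pair is counted once.
        for nb in ((x + 1, y, z), (x, y + 1, z), (x, y, z + 1)):
            j = index_at.get(nb)
            if j is not None and s[j] == "1" and abs(i - j) > 1:
                contacts += 1
    return contacts


def a_runs_covered(t: str) -> bool:
    """True if every maximal run of a's is immediately followed by a run of b's
    that is at least as long. Assumes t contains only the characters 'a' and 'b'."""
    runs = [(ch, len(list(group))) for ch, group in groupby(t)]
    for k, (ch, n) in enumerate(runs):
        if ch == "a" and (k + 1 >= len(runs) or runs[k + 1][1] < n):
            return False
    return True
```

For example, `count_contacts("1001", [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)])` scores a fold that bends the string into a unit square, bringing its two 1’s onto adjacent lattice points. Both checks take time linear in the input length, but finding a good fold or a long subsequence is the part the paper addresses and is not attempted here.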

## Keywords

Lattice Point, Combinatorial Problem, Binary String, Linear Time Algorithm, Input String

## References

- [BL98] Berger, B., Leighton, T.: Protein Folding in the Hydrophobic-Hydrophilic (HP) Model is NP-Complete. In: Proceedings of the 2nd Conference on Computational Molecular Biology, RECOMB (1998)
- [CGP+98] Crescenzi, P., Goldman, D., Papadimitriou, C., Piccolboni, A., Yannakakis, M.: On the Complexity of Protein Folding. In: Proceedings of the 2nd Conference on Computational Molecular Biology, RECOMB (1998)
- [Dil85] Dill, K.A.: Theory for the Folding and Stability of Globular Proteins. Biochemistry 24, 1501 (1985)
- [Dil90] Dill, K.A.: Dominant Forces in Protein Folding. Biochemistry 29, 7133–7155 (1990)
- [HI95] Hart, W.E., Istrail, S.: Fast Protein Folding in the Hydrophobic-hydrophilic Model within Three-eighths of Optimal. In: Proceedings of the 27th ACM Symposium on the Theory of Computing, STOC (1995)
- [New02] Newman, A.: A New Algorithm for Protein Folding in the HP Model. In: Proceedings of the 13th ACM-SIAM Symposium on Discrete Algorithms, SODA (2002)