Lynx: A Programmatic SAT Solver for the RNA-Folding Problem
Abstract
This paper introduces Lynx, an incremental programmatic SAT solver that allows non-expert users to introduce domain-specific code into modern conflict-driven clause-learning (CDCL) SAT solvers, thus enabling users to guide the behavior of the solver.
The key idea of Lynx is a callback interface that enables non-expert users to specialize the SAT solver to a class of Boolean instances. The user writes specialized code for a class of Boolean formulas, which is periodically called by Lynx’s search routine in its inner loop through the callback interface. The user-provided code is allowed to examine partial solutions generated by the solver during its search, and to respond by adding CNF clauses back to the solver dynamically and incrementally. Thus, the user-provided code can specialize and influence the solver’s search in a highly targeted fashion. While the power of incremental SAT solvers has been amply demonstrated in the SAT literature and in the context of DPLL(T), it has not been previously made available as a programmatic API that is easy to use for non-expert users. Lynx’s callback interface is a simple yet very effective strategy that addresses this need.
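The callback mechanism described above can be illustrated with a minimal sketch. This is not Lynx's actual API: the names `solve`, `Callback`, and `forbid_two_true` are hypothetical, and the search is a plain backtracking procedure with unit propagation rather than a full CDCL solver. It only shows the control flow in which user code inspects a partial assignment and returns extra CNF clauses that the solver adds incrementally.

```python
from typing import Callable, Dict, List, Optional

Clause = List[int]            # DIMACS-style literals: 3 means x3, -3 means NOT x3
Assignment = Dict[int, bool]  # variable -> value; partial during search
Callback = Callable[[Assignment], List[Clause]]


def _propagate(clauses: List[Clause], assign: Assignment) -> bool:
    """Unit propagation in place. Returns False if some clause is falsified."""
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            unassigned = []
            satisfied = False
            for lit in clause:
                var = abs(lit)
                if var not in assign:
                    unassigned.append(lit)
                elif assign[var] == (lit > 0):
                    satisfied = True
                    break
            if satisfied:
                continue
            if not unassigned:
                return False          # every literal is false: conflict
            if len(unassigned) == 1:
                lit = unassigned[0]   # unit clause forces this literal
                assign[abs(lit)] = lit > 0
                changed = True
    return True


def solve(num_vars: int, clauses: List[Clause], callback: Callback,
          assign: Optional[Assignment] = None) -> Optional[Assignment]:
    """Backtracking search that consults `callback` on each partial assignment.

    Clauses returned by the callback are appended to `clauses` and kept for
    the rest of the search, so they must be valid for the whole problem.
    """
    assign = dict(assign or {})
    while True:
        if not _propagate(clauses, assign):
            return None
        # Hand the user code a copy of the partial assignment.
        new = [c for c in callback(dict(assign)) if c not in clauses]
        if not new:
            break
        clauses.extend(new)           # incremental clause addition
    free = [v for v in range(1, num_vars + 1) if v not in assign]
    if not free:
        return assign                 # total assignment, all clauses satisfied
    var = free[0]
    for value in (True, False):
        result = solve(num_vars, clauses, callback, {**assign, var: value})
        if result is not None:
            return result
    return None


def forbid_two_true(assign: Assignment) -> List[Clause]:
    """Hypothetical domain rule: at most one variable may be true.

    Instead of encoding every pairwise exclusion up front, emit only the
    exclusions that the current partial assignment violates.
    """
    true_vars = sorted(v for v, val in assign.items() if val)
    return [[-a, -b] for i, a in enumerate(true_vars) for b in true_vars[i + 1:]]


demo_clauses: List[Clause] = [[1, 2], [2, 3]]
model = solve(3, demo_clauses, forbid_two_true)
```

In this sketch the initial formula contains only two clauses; the pairwise exclusions enter `demo_clauses` only when the search actually tries an assignment that breaks them.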
We demonstrate the benefits of Lynx through a case study from computational biology, namely the RNA secondary structure prediction problem. The constraints that make up this problem fall into two categories: structural constraints, which describe properties of the biological structure of the solution, and energetic constraints, which encode quantitative requirements that the solution must satisfy. We show that by introducing structural constraints on demand through user-provided code, we can achieve up to a 30x reduction in memory usage and up to a 100x reduction in time compared with standard SAT approaches.
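To make the on-demand idea concrete, the following sketch compares an eager and a lazy encoding of one plausible structural constraint: each base takes part in at most one base pair. The encoding is an assumption chosen for illustration and is not taken from the paper; `pair_variables`, `eager_at_most_one`, and `lazy_at_most_one` are hypothetical helper names. It shows only why generating clauses from a partial assignment can keep the clause database small.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Pair = Tuple[int, int]  # (i, j) with i < j: base i is paired with base j


def pair_variables(n: int) -> Dict[Pair, int]:
    """Give every candidate pair (i, j), i < j, its own SAT variable (1-based)."""
    return {pair: idx
            for idx, pair in enumerate(combinations(range(n), 2), start=1)}


def eager_at_most_one(var_of: Dict[Pair, int]) -> List[List[int]]:
    """Up-front encoding: one clause for every two pairs that share a base."""
    return [[-vp, -vq]
            for (p, vp), (q, vq) in combinations(var_of.items(), 2)
            if set(p) & set(q)]


def lazy_at_most_one(var_of: Dict[Pair, int],
                     assignment: Dict[int, bool]) -> List[List[int]]:
    """On-demand encoding: only clauses violated by pairs currently set true."""
    chosen = [(p, v) for p, v in var_of.items() if assignment.get(v)]
    return [[-vp, -vq]
            for (p, vp), (q, vq) in combinations(chosen, 2)
            if set(p) & set(q)]


var_of = pair_variables(6)               # toy sequence of 6 bases
eager = eager_at_most_one(var_of)

# A partial assignment in which base 1 is (wrongly) paired with both 3 and 4.
partial = {var_of[(0, 5)]: True, var_of[(1, 3)]: True, var_of[(1, 4)]: True}
lazy = lazy_at_most_one(var_of, partial)

print(len(eager), len(lazy))
```

A function like `lazy_at_most_one` could serve as the body of a solver callback: it is called on each partial solution and returns only the clauses needed to rule out the current violation.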
Keywords
Boolean Formula, Baseline Approach, Callback Function, Pseudoknotted Structure, Presburger Arithmetic