Symbolic Computation via Program Transformation

Lauko, Henrich; Ročkai, Petr; Barnat, Jiří

doi:10.1007/978-3-030-02508-3_17

Henrich Lauko¹⁵,
Petr Ročkai¹⁵ &
Jiří Barnat¹⁵

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 11187))

Included in the following conference series:

International Colloquium on Theoretical Aspects of Computing

553 Accesses
20 Citations
1 Altmetric

Abstract

Symbolic computation is an important approach in automated program analysis. Most state-of-the-art tools perform symbolic computation as interpreters and directly maintain symbolic data. In this paper, we show that it is feasible, and in fact practical, to use a compiler-based strategy instead. Using compiler tooling, we propose and implement a transformation which takes a standard program and outputs a program that performs a semantically equivalent, but partially symbolic, computation. The transformed program maintains symbolic values internally and operates directly on them; therefore, the program can be processed by a tool without support for symbolic manipulation.

The main motivation for the transformation is in symbolic verification, but there are many other possible use-cases, including test generation and concolic testing. Moreover, using the transformation simplifies tools, since the symbolic computation is handled by the program directly. We have implemented the transformation at the level of LLVM bitcode. The paper includes an experimental evaluation, based on an explicit-state software model checker as a verification backend.

This work has been partially supported by the Czech Science Foundation grant 18-02177S and by Red Hat, Inc.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 69.99; Price excludes VAT (USA)

Softcover Book: USD 89.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

1.
In fact, any control-explicit, data-symbolic model checker already contains a subroutine (in our case about 200 lines) which effectively implements a symbolic executor.
2.
https://divine.fi.muni.cz/2018/sym.
3.
CIL is short for C Intermediate Language [23], and is a simplified subset of the C language. The toolkit can automatically translate standard C into the intermediate (CIL) form. The CIL form can be optionally brought into the form of three-address code and this feature is used in CREST.
4.
Programs in LLVM are in a partial SSA form, a special case of three-address code [1]. Three-address code is essentially small-step semantics in an executable form.
5.
Additionally, since the SSA portion of the LLVM IR is already statically typed, we can take advantage of this existing type system in the implementation. Nonetheless, the treatment in this section does not depend on LLVM and would be applicable to any dataflow-oriented program representation.
6.
Since multiple abstract domains can co-exist in a single program, we use the lower index \(i\) to distinguish them.
7.
The exact layout of data (structures, arrays, dynamic memory) is normally the responsibility of the program itself, more so in the case of intermediate or low-level languages. For this reason, it is often the case that the program will make various assumptions about relationships among addresses within the same memory object. It is impractical, if not impossible, to automatically adapt the program to a different data layout, e.g. in case the size of a scalar value would change due to abstraction.
8.
The only way a value can be copied from one memory address to another is via a load instruction followed by a store, both of which are instrumented and as such also transfer the supplementary data.
9.
For instance, concrete operands to abstract operations are lifted, arguments to necessarily concrete functions (e.g. real system calls) are lowered. Memory stores are replaced with freeze and loads with thaw.
10.
The names lift and lower allude to the relationship of the abstract and the concrete domain. In applications with multiple abstract domains, it may be expedient to include additional instructions that convert directly from one abstract domain to another, although in theory it is always possible to go through the concrete domain.
11.
Again, this is true of LLVM bitcode – it is already in a partial SSA form. This simplifies our prototype implementation somewhat.
12.
An abstract domain is called relational when it is capable of preserving information about relationships among various abstract values that appear in the program.
13.
In the present paper, we only deal with abstract (symbolic) values. The structure of the program state, that is, the arrangement of the program memory, is taken to be always represented explicitly, i.e., it belongs squarely to the concrete domain.
14.
We have used the utility sloccount to get estimates of module size in terms of lines of code.

References

Aho, A.V.: Compilers: Principles, Techniques, and Tools. Addison-Wesley Series in Computer Science. Pearson/Addison Wesley, Boston (2007)
Google Scholar
Albarghouthi, A., Gurfinkel, A., Chechik, M.: From under-approximations to over-approximations and back. In: Flanagan, C., König, B. (eds.) TACAS 2012. LNCS, vol. 7214, pp. 157–172. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-28756-5_12
Chapter MATH Google Scholar
Barrett, C., Fontaine, P., Tinelli, C.: SMT-LIB: the satisfiability modulo theories library. http://www.smt-lib.org/
Bauch, P., Havel, V., Barnat, J.: Control explicit-data symbolic model checking. ACM Trans. Softw. Eng. Methodol. 25(2) (2016). Article no. 15. https://doi.org/10.1145/2888393
Article Google Scholar
Beyer, D.: Reliable and reproducible competition results with BenchExec and witnesses (report on SV-COMP 2016). In: Chechik, M., Raskin, J.-F. (eds.) TACAS 2016. LNCS, vol. 9636, pp. 887–904. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-49674-9_55
Chapter Google Scholar
Beyer, D., Keremoglu, M.E.: CPAchecker: a tool for configurable software verification. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 184–190. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22110-1_16
Chapter Google Scholar
Beyer, D., Löwe, S.: Interpolation for value analysis. In: Aßmann, U., Demuth, B., Spitta, T., Püschel, G., Kaiser, R. (eds.) Software Engineering and Management. Lecture Notes in Informatics, vol. 239, pp. 73–74. Gesellschaft für Informatik, Bonn (2015). https://dl.gi.de/handle/20.500.12116/2495
Beyer, B., Henzinger, T.A., Jhala, R., Majumdar, R.: The software model checker Blast. Int. J. Softw. Tools Technol. Transfer 9(5), 505–525 (2007). https://doi.org/10.1007/s10009-007-0044-z
Article Google Scholar
Burnim, J., Sen, K.: Heuristics for scalable dynamic test generation. In: Proceedings of 23rd IEEE/ACM International Conference on Automated Software Engineering, ASE 2008, L’Aquila, September 2008, pp. 443–446. IEEE CS Press, Washington, DC (2008). https://doi.org/10.1109/ase.2008.69
Cadar, C., Dunbar, D., Engler, D.R.: KLEE: unassisted and automatic generation of high-coverage tests for complex systems programs. In: Proceedings of 8th USENIX Symposium on Operating Systems Design and Implementation, San Diego, CA, December 2008, pp. 209–224. USENIX Association (2008). http://www.usenix.org/events/osdi08/tech/full_papers/cadar/cadar.pdf
Cavada, R., et al.: The nuXmv symbolic model checker. In: Biere, A., Bloem, R. (eds.) CAV 2014. LNCS, vol. 8559, pp. 334–342. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-08867-9_22
Chapter Google Scholar
Chalupa, M., Vitovská, M., Jonáš, M., Slaby, J., Strejček, J.: Symbiotic 4: beyond reachability. In: Legay, A., Margaria, T. (eds.) TACAS 2017. LNCS, vol. 10206, pp. 385–389. Springer, Heidelberg (2017). https://doi.org/10.1007/978-3-662-54580-5_28
Chapter Google Scholar
Clarke, E., Grumberg, O., Jha, S., Lu, Y., Veith, H.: Counterexample-guided abstraction refinement. In: Emerson, E.A., Sistla, A.P. (eds.) CAV 2000. LNCS, vol. 1855, pp. 154–169. Springer, Heidelberg (2000). https://doi.org/10.1007/10722167_15
Chapter Google Scholar
Daniel, J., Parízek, P.: PANDA: simultaneous predicate abstraction and concrete execution. In: Piterman, N. (ed.) HVC 2015. LNCS, vol. 9434, pp. 87–103. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-26287-1_6
Chapter Google Scholar
Ganesh, V., Dill, D.L.: A decision procedure for bit-vectors and arrays. In: Damm, W., Hermanns, H. (eds.) CAV 2007. LNCS, vol. 4590, pp. 519–531. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-73368-3_52
Chapter Google Scholar
Havelund, K., Pressburger, T.: Model checking JAVA programs using JAVA PathFinder. Int. J. Softw. Tools Technol. Transfer 2(4), 366–381 (2000). https://doi.org/10.1007/s100090050043
Article MATH Google Scholar
Khurshid, S., Păsăreanu, C.S., Visser, W.: Generalized symbolic execution for model checking and testing. In: Garavel, H., Hatcliff, J. (eds.) TACAS 2003. LNCS, vol. 2619, pp. 553–568. Springer, Heidelberg (2003). https://doi.org/10.1007/3-540-36577-X_40
Chapter MATH Google Scholar
King, J.C.: Symbolic execution and program testing. Commun. ACM 19(7), 385–394 (1976). https://doi.org/10.1145/360248.360252
Article MathSciNet MATH Google Scholar
Kroening, D., Tautschnig, M.: CBMC – C bounded model checker. In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413, pp. 389–391. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-642-54862-8_26
Chapter Google Scholar
Lattner, C., Adve, V.: LLVM: a compilation framework for lifelong program analysis and transformation. In: Proceedings of 2nd IEEE/ACM International Symposium on Code Generation and Optimization, CGO 2004, Palo Alto, CA, March 2004, pp. 75–88. IEEE CS Press, Washington, DC (2004). https://doi.org/10.1109/cgo.2004.1281665
McMillan, K.L.: Symbolic Model Checking. Kluwer Academic Publishers, Boston (1993). https://doi.org/10.1007/978-1-4615-3190-6
Book MATH Google Scholar
Mrázek, J., Bauch, P., Lauko, H., Barnat, J.: SymDIVINE: tool for control-explicit data-symbolic state space exploration. In: Bošnački, D., Wijs, A. (eds.) SPIN 2016. LNCS, vol. 9641, pp. 208–213. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-32582-8_14
Chapter Google Scholar
Necula, G.C., McPeak, S., Rahul, S.P., Weimer, W.: CIL: intermediate language and tools for analysis and transformation of C programs. In: Horspool, R.N. (ed.) CC 2002. LNCS, vol. 2304, pp. 213–228. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-45937-5_16
Chapter Google Scholar
Nielson, F., Nielson, H.R., Hankin, C.: Principles of Program Analysis. Springer, Heidelberg (1999). https://doi.org/10.1007/978-3-662-03811-6
Book MATH Google Scholar
Sen, K., Agha, G.: CUTE and jCUTE: concolic unit testing and explicit path model-checking tools. In: Ball, T., Jones, R.B. (eds.) CAV 2006. LNCS, vol. 4144, pp. 419–423. Springer, Heidelberg (2006). https://doi.org/10.1007/11817963_38
Chapter Google Scholar
Sen, K., Marinov, D., Agha, G.: CUTE: a concolic unit testing engine for C. In: Proceedings of Joint 10th European Software Engineering Conference and 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering, ESEC/FSE 2005, Lisbon, September 2005, pp. 263–272. ACM Press, New York (2005). https://doi.org/10.1145/1081706.1081750
Sousa, M., Rodríguez, C., D’Silva, V., Kroening, D.: Abstract interpretation with unfoldings. In: Majumdar, R., Kunčak, V. (eds.) CAV 2017. LNCS, vol. 10427, pp. 197–216. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63390-9_11
Chapter Google Scholar
Weißenbacher, G.: Program analysis with interpolants. Ph.D. thesis, University of Oxford (2010)
Google Scholar

Download references

Author information

Authors and Affiliations

Faculty of Informatics, Masaryk University, Botanická 68a, 602 00, Brno, Czech Republic
Henrich Lauko, Petr Ročkai & Jiří Barnat

Authors

Henrich Lauko
View author publications
You can also search for this author in PubMed Google Scholar
Petr Ročkai
View author publications
You can also search for this author in PubMed Google Scholar
Jiří Barnat
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Henrich Lauko .

Editor information

Editors and Affiliations

Stellenbosch University, Stellenbosch, South Africa
Bernd Fischer
Reykjavík University, Reykjavik, Iceland
Tarmo Uustalu

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Lauko, H., Ročkai, P., Barnat, J. (2018). Symbolic Computation via Program Transformation. In: Fischer, B., Uustalu, T. (eds) Theoretical Aspects of Computing – ICTAC 2018. ICTAC 2018. Lecture Notes in Computer Science(), vol 11187. Springer, Cham. https://doi.org/10.1007/978-3-030-02508-3_17

Download citation

DOI: https://doi.org/10.1007/978-3-030-02508-3_17
Published: 15 October 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-02507-6
Online ISBN: 978-3-030-02508-3
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics