Skip to main content

Symbolic Computation via Program Transformation

Part of the Lecture Notes in Computer Science book series (LNTCS,volume 11187)

Abstract

Symbolic computation is an important approach in automated program analysis. Most state-of-the-art tools perform symbolic computation as interpreters and directly maintain symbolic data. In this paper, we show that it is feasible, and in fact practical, to use a compiler-based strategy instead. Using compiler tooling, we propose and implement a transformation which takes a standard program and outputs a program that performs a semantically equivalent, but partially symbolic, computation. The transformed program maintains symbolic values internally and operates directly on them; therefore, the program can be processed by a tool without support for symbolic manipulation.

The main motivation for the transformation is in symbolic verification, but there are many other possible use-cases, including test generation and concolic testing. Moreover, using the transformation simplifies tools, since the symbolic computation is handled by the program directly. We have implemented the transformation at the level of LLVM bitcode. The paper includes an experimental evaluation, based on an explicit-state software model checker as a verification backend.

This work has been partially supported by the Czech Science Foundation grant 18-02177S and by Red Hat, Inc.

This is a preview of subscription content, access via your institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • DOI: 10.1007/978-3-030-02508-3_17
  • Chapter length: 20 pages
  • Instant PDF download
  • Readable on all devices
  • Own it forever
  • Exclusive offer for individuals only
  • Tax calculation will be finalised during checkout
eBook
USD   79.99
Price excludes VAT (USA)
  • ISBN: 978-3-030-02508-3
  • Instant PDF download
  • Readable on all devices
  • Own it forever
  • Exclusive offer for individuals only
  • Tax calculation will be finalised during checkout
Softcover Book
USD   99.99
Price excludes VAT (USA)
Fig. 1.
Fig. 2.
Fig. 3.
Fig. 4.
Fig. 5.
Fig. 6.
Fig. 7.

Notes

  1. 1.

    In fact, any control-explicit, data-symbolic model checker already contains a subroutine (in our case about 200 lines) which effectively implements a symbolic executor.

  2. 2.

    https://divine.fi.muni.cz/2018/sym.

  3. 3.

    CIL is short for C Intermediate Language [23], and is a simplified subset of the C language. The toolkit can automatically translate standard C into the intermediate (CIL) form. The CIL form can be optionally brought into the form of three-address code and this feature is used in CREST.

  4. 4.

    Programs in LLVM are in a partial SSA form, a special case of three-address code [1]. Three-address code is essentially small-step semantics in an executable form.

  5. 5.

    Additionally, since the SSA portion of the LLVM IR is already statically typed, we can take advantage of this existing type system in the implementation. Nonetheless, the treatment in this section does not depend on LLVM and would be applicable to any dataflow-oriented program representation.

  6. 6.

    Since multiple abstract domains can co-exist in a single program, we use the lower index \(i\) to distinguish them.

  7. 7.

    The exact layout of data (structures, arrays, dynamic memory) is normally the responsibility of the program itself, more so in the case of intermediate or low-level languages. For this reason, it is often the case that the program will make various assumptions about relationships among addresses within the same memory object. It is impractical, if not impossible, to automatically adapt the program to a different data layout, e.g. in case the size of a scalar value would change due to abstraction.

  8. 8.

    The only way a value can be copied from one memory address to another is via a load instruction followed by a store, both of which are instrumented and as such also transfer the supplementary data.

  9. 9.

    For instance, concrete operands to abstract operations are lifted, arguments to necessarily concrete functions (e.g. real system calls) are lowered. Memory stores are replaced with freeze and loads with thaw.

  10. 10.

    The names lift and lower allude to the relationship of the abstract and the concrete domain. In applications with multiple abstract domains, it may be expedient to include additional instructions that convert directly from one abstract domain to another, although in theory it is always possible to go through the concrete domain.

  11. 11.

    Again, this is true of LLVM bitcode – it is already in a partial SSA form. This simplifies our prototype implementation somewhat.

  12. 12.

    An abstract domain is called relational when it is capable of preserving information about relationships among various abstract values that appear in the program.

  13. 13.

    In the present paper, we only deal with abstract (symbolic) values. The structure of the program state, that is, the arrangement of the program memory, is taken to be always represented explicitly, i.e., it belongs squarely to the concrete domain.

  14. 14.

    We have used the utility sloccount to get estimates of module size in terms of lines of code.

References

  1. Aho, A.V.: Compilers: Principles, Techniques, and Tools. Addison-Wesley Series in Computer Science. Pearson/Addison Wesley, Boston (2007)

    Google Scholar 

  2. Albarghouthi, A., Gurfinkel, A., Chechik, M.: From under-approximations to over-approximations and back. In: Flanagan, C., König, B. (eds.) TACAS 2012. LNCS, vol. 7214, pp. 157–172. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-28756-5_12

    CrossRef  MATH  Google Scholar 

  3. Barrett, C., Fontaine, P., Tinelli, C.: SMT-LIB: the satisfiability modulo theories library. http://www.smt-lib.org/

  4. Bauch, P., Havel, V., Barnat, J.: Control explicit-data symbolic model checking. ACM Trans. Softw. Eng. Methodol. 25(2) (2016). Article no. 15. https://doi.org/10.1145/2888393

    CrossRef  Google Scholar 

  5. Beyer, D.: Reliable and reproducible competition results with BenchExec and witnesses (report on SV-COMP 2016). In: Chechik, M., Raskin, J.-F. (eds.) TACAS 2016. LNCS, vol. 9636, pp. 887–904. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-49674-9_55

    CrossRef  Google Scholar 

  6. Beyer, D., Keremoglu, M.E.: CPAchecker: a tool for configurable software verification. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 184–190. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22110-1_16

    CrossRef  Google Scholar 

  7. Beyer, D., Löwe, S.: Interpolation for value analysis. In: Aßmann, U., Demuth, B., Spitta, T., Püschel, G., Kaiser, R. (eds.) Software Engineering and Management. Lecture Notes in Informatics, vol. 239, pp. 73–74. Gesellschaft für Informatik, Bonn (2015). https://dl.gi.de/handle/20.500.12116/2495

  8. Beyer, B., Henzinger, T.A., Jhala, R., Majumdar, R.: The software model checker Blast. Int. J. Softw. Tools Technol. Transfer 9(5), 505–525 (2007). https://doi.org/10.1007/s10009-007-0044-z

    CrossRef  Google Scholar 

  9. Burnim, J., Sen, K.: Heuristics for scalable dynamic test generation. In: Proceedings of 23rd IEEE/ACM International Conference on Automated Software Engineering, ASE 2008, L’Aquila, September 2008, pp. 443–446. IEEE CS Press, Washington, DC (2008). https://doi.org/10.1109/ase.2008.69

  10. Cadar, C., Dunbar, D., Engler, D.R.: KLEE: unassisted and automatic generation of high-coverage tests for complex systems programs. In: Proceedings of 8th USENIX Symposium on Operating Systems Design and Implementation, San Diego, CA, December 2008, pp. 209–224. USENIX Association (2008). http://www.usenix.org/events/osdi08/tech/full_papers/cadar/cadar.pdf

  11. Cavada, R., et al.: The nuXmv symbolic model checker. In: Biere, A., Bloem, R. (eds.) CAV 2014. LNCS, vol. 8559, pp. 334–342. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-08867-9_22

    CrossRef  Google Scholar 

  12. Chalupa, M., Vitovská, M., Jonáš, M., Slaby, J., Strejček, J.: Symbiotic 4: beyond reachability. In: Legay, A., Margaria, T. (eds.) TACAS 2017. LNCS, vol. 10206, pp. 385–389. Springer, Heidelberg (2017). https://doi.org/10.1007/978-3-662-54580-5_28

    CrossRef  Google Scholar 

  13. Clarke, E., Grumberg, O., Jha, S., Lu, Y., Veith, H.: Counterexample-guided abstraction refinement. In: Emerson, E.A., Sistla, A.P. (eds.) CAV 2000. LNCS, vol. 1855, pp. 154–169. Springer, Heidelberg (2000). https://doi.org/10.1007/10722167_15

    CrossRef  Google Scholar 

  14. Daniel, J., Parízek, P.: PANDA: simultaneous predicate abstraction and concrete execution. In: Piterman, N. (ed.) HVC 2015. LNCS, vol. 9434, pp. 87–103. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-26287-1_6

    CrossRef  Google Scholar 

  15. Ganesh, V., Dill, D.L.: A decision procedure for bit-vectors and arrays. In: Damm, W., Hermanns, H. (eds.) CAV 2007. LNCS, vol. 4590, pp. 519–531. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-73368-3_52

    CrossRef  Google Scholar 

  16. Havelund, K., Pressburger, T.: Model checking JAVA programs using JAVA PathFinder. Int. J. Softw. Tools Technol. Transfer 2(4), 366–381 (2000). https://doi.org/10.1007/s100090050043

    CrossRef  MATH  Google Scholar 

  17. Khurshid, S., Păsăreanu, C.S., Visser, W.: Generalized symbolic execution for model checking and testing. In: Garavel, H., Hatcliff, J. (eds.) TACAS 2003. LNCS, vol. 2619, pp. 553–568. Springer, Heidelberg (2003). https://doi.org/10.1007/3-540-36577-X_40

    CrossRef  MATH  Google Scholar 

  18. King, J.C.: Symbolic execution and program testing. Commun. ACM 19(7), 385–394 (1976). https://doi.org/10.1145/360248.360252

    MathSciNet  CrossRef  MATH  Google Scholar 

  19. Kroening, D., Tautschnig, M.: CBMC – C bounded model checker. In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413, pp. 389–391. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-642-54862-8_26

    CrossRef  Google Scholar 

  20. Lattner, C., Adve, V.: LLVM: a compilation framework for lifelong program analysis and transformation. In: Proceedings of 2nd IEEE/ACM International Symposium on Code Generation and Optimization, CGO 2004, Palo Alto, CA, March 2004, pp. 75–88. IEEE CS Press, Washington, DC (2004). https://doi.org/10.1109/cgo.2004.1281665

  21. McMillan, K.L.: Symbolic Model Checking. Kluwer Academic Publishers, Boston (1993). https://doi.org/10.1007/978-1-4615-3190-6

    CrossRef  MATH  Google Scholar 

  22. Mrázek, J., Bauch, P., Lauko, H., Barnat, J.: SymDIVINE: tool for control-explicit data-symbolic state space exploration. In: Bošnački, D., Wijs, A. (eds.) SPIN 2016. LNCS, vol. 9641, pp. 208–213. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-32582-8_14

    CrossRef  Google Scholar 

  23. Necula, G.C., McPeak, S., Rahul, S.P., Weimer, W.: CIL: intermediate language and tools for analysis and transformation of C programs. In: Horspool, R.N. (ed.) CC 2002. LNCS, vol. 2304, pp. 213–228. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-45937-5_16

    CrossRef  Google Scholar 

  24. Nielson, F., Nielson, H.R., Hankin, C.: Principles of Program Analysis. Springer, Heidelberg (1999). https://doi.org/10.1007/978-3-662-03811-6

    CrossRef  MATH  Google Scholar 

  25. Sen, K., Agha, G.: CUTE and jCUTE: concolic unit testing and explicit path model-checking tools. In: Ball, T., Jones, R.B. (eds.) CAV 2006. LNCS, vol. 4144, pp. 419–423. Springer, Heidelberg (2006). https://doi.org/10.1007/11817963_38

    CrossRef  Google Scholar 

  26. Sen, K., Marinov, D., Agha, G.: CUTE: a concolic unit testing engine for C. In: Proceedings of Joint 10th European Software Engineering Conference and 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering, ESEC/FSE 2005, Lisbon, September 2005, pp. 263–272. ACM Press, New York (2005). https://doi.org/10.1145/1081706.1081750

  27. Sousa, M., Rodríguez, C., D’Silva, V., Kroening, D.: Abstract interpretation with unfoldings. In: Majumdar, R., Kunčak, V. (eds.) CAV 2017. LNCS, vol. 10427, pp. 197–216. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63390-9_11

    CrossRef  Google Scholar 

  28. Weißenbacher, G.: Program analysis with interpolants. Ph.D. thesis, University of Oxford (2010)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Henrich Lauko .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and Permissions

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Verify currency and authenticity via CrossMark

Cite this paper

Lauko, H., Ročkai, P., Barnat, J. (2018). Symbolic Computation via Program Transformation. In: Fischer, B., Uustalu, T. (eds) Theoretical Aspects of Computing – ICTAC 2018. ICTAC 2018. Lecture Notes in Computer Science(), vol 11187. Springer, Cham. https://doi.org/10.1007/978-3-030-02508-3_17

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-02508-3_17

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-02507-6

  • Online ISBN: 978-3-030-02508-3

  • eBook Packages: Computer ScienceComputer Science (R0)