Automating Software Analysis at Large Scale

  • Daniel Kroening
  • Michael TautschnigEmail author
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8934)


Actual software in use today is not known to follow any uniform normal distribution, whether syntactically—in the language of all programs described by the grammar of a given programming language, or semantically—for example, in the set of reachable states. Hence claims deduced from any given set of benchmarks need not extend to real-world software systems.

When building software analysis tools, this affects all aspects of tool construction: starting from language front ends not being able to parse and process real-world programs, over inappropriate assumptions about (non-)scalability, to failing to meet actual needs of software developers.

To narrow the gap between real-world software demands and software analysis tool construction, an experiment using the Debian Linux distribution has been set up. The Debian distribution presently comprises of more than \(22000\) source software packages. Focussing on C source code, more than \(400\) million lines of code are automatically analysed in this experiment, resulting in a number of improvements in analysis tools on the one hand, but also more than \(700\) public bug reports to date.


Model Checker Source Package Reachable State Intermediate Representation Program Language Design 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.


  1. 1.
    Alglave, J., Donaldson, A.F., Kroening, D., Tautschnig, M.: Making software verification tools really work. In: Bultan, T., Hsiung, P.-A. (eds.) ATVA 2011. LNCS, vol. 6996, pp. 28–42. Springer, Heidelberg (2011)Google Scholar
  2. 2.
    Alglave, J., Kroening, D., Nimal, V., Poetzl, D.: Don’t sit on the fence - a static analysis approach to automatic fence insertion. In: Biere, A., Bloem, R. (eds.) CAV 2014. LNCS, vol. 8559, pp. 508–524. Springer, Heidelberg (2014)Google Scholar
  3. 3.
    Alglave, J., Kroening, D., Nimal, V., Tautschnig, M.: Software verification for weak memory via program transformation. In: Felleisen, M., Gardner, P. (eds.) ESOP 2013. LNCS, vol. 7792, pp. 512–532. Springer, Heidelberg (2013)Google Scholar
  4. 4.
    Alglave, J., Maranget, L., Tautschnig, M.: Herding cats: modelling, simulation, testing, and data-mining for weak memory. In: Programming Language Design and Implementation (PLDI 2014), p. 7. ACM (2014)Google Scholar
  5. 5.
    Batty, M., Owens, S., Sarkar, S., Sewell, P., Weber, T.: Mathematizing C++ concurrency. In: Symposium on Principles of Programming Languages (POPL 2011), pp. 55–66. ACM (2011)Google Scholar
  6. 6.
    Beyer, D.: Status report on software verification - (competition summary SV-COMP 2014). In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413, pp. 373–388. Springer, Heidelberg (2014)Google Scholar
  7. 7.
    Boehm, H., Adve, S.V.: Foundations of the C++ concurrency memory model. In: Programming Language Design and Implementation (PLDI 2008), pp. 68–78. ACM (2008)Google Scholar
  8. 8.
    Clarke, E., Kroning, D., Lerda, F.: A tool for checking ANSI-C programs. In: Jensen, K., Podelski, A. (eds.) TACAS 2004. LNCS, vol. 2988, pp. 168–176. Springer, Heidelberg (2004)Google Scholar
  9. 9.
    Clarke, E.M., Kroening, D., Yorav, K.: Behavioral consistency of C and verilog programs using bounded model checking. In: Design Automation Conference (DAC 2003), pp. 368–371. ACM (2003)Google Scholar
  10. 10.
    Cook, B., Podelski, A., Rybalchenko, A.: Termination proofs for systems code. In: Programming Language Design and Implementation (PLDI 2006), pp. 415–426. ACM (2006)Google Scholar
  11. 11.
    Eén, N., Sörensson, N.: An extensible SAT-solver. In: Giunchiglia, E., Tacchella, A. (eds.) SAT 2003. LNCS, vol. 2919, pp. 502–518. Springer, Heidelberg (2004)Google Scholar
  12. 12.
    Kroening, D., Lewis, M., Weissenbacher, G.: Under-approximating loops in C programs for fast counterexample detection. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 381–396. Springer, Heidelberg (2013)Google Scholar
  13. 13.
    Kroening, D., Tautschnig, M.: CBMC – C bounded model checker - (competition contribution). In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413, pp. 389–391. Springer, Heidelberg (2014)Google Scholar
  14. 14.
    de Moura, L., Bjørner, N.: Z3: an efficient SMT solver. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg (2008)Google Scholar
  15. 15.
    Nussbaum, L., Zacchiroli, S.: The ultimate debian database: consolidating bazaar metadata for quality assurance and data mining. In: Mining Software Repositories (MSR 2010), pp. 52–61. IEEE (2010)Google Scholar

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. 1.University of OxfordOxfordUK
  2. 2.Queen Mary University of LondonLondonUK

Personalised recommendations