Mining the Archive of Formal Proofs

  • Jasmin Christian Blanchette
  • Maximilian Haslbeck
  • Daniel Matichuk
  • Tobias Nipkow
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9150)


The Archive of Formal Proofs is a vast collection of computer-checked proofs developed using the proof assistant Isabelle. We perform an in-depth analysis of the archive, looking at various properties of the proof developments, including size, dependencies, and proof style. This gives some insights into the nature of formal proofs.


Keywords: Statement Size, Horn Clause, Proof Assistant, Lemma Statement, Citation Graph



Our colleague Johannes Hölzl suggested the proof depth diagram (Fig. 5), which we found insightful. Matichuk is partially supported by NICTA. NICTA is funded by the Australian Government through the Department of Communications and the Australian Research Council through the ICT Centre of Excellence Program. Nipkow is supported by DFG grant NI 491/16-1.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Jasmin Christian Blanchette (1, 2)
  • Maximilian Haslbeck (3)
  • Daniel Matichuk (4, 5)
  • Tobias Nipkow (3)

  1. Inria Nancy and LORIA, Villers-lès-Nancy, France
  2. Max-Planck-Institut für Informatik, Saarbrücken, Germany
  3. Fakultät für Informatik, Technische Universität München, Munich, Germany
  4. NICTA, Sydney, Australia
  5. University of New South Wales, Sydney, Australia
