JustBench: A Framework for OWL Benchmarking

  • Samantha Bail
  • Bijan Parsia
  • Ulrike Sattler
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6496)


Analysing the performance of OWL reasoners on expressive OWL ontologies is an ongoing challenge. In this paper, we present a new approach to performance analysis based on justifications for entailments of OWL ontologies. Justifications are minimal subsets of an ontology that are sufficient for an entailment to hold, and are commonly used to debug OWL ontologies. In JustBench, justifications form the key unit of test: individual justifications are tested for correctness and reasoner performance instead of entire ontologies or random subsets. Justifications are generally small and relatively easy to analyse, which makes them well suited to transparent analytic micro-benchmarks. Furthermore, the JustBench approach allows us to isolate reasoner errors and inconsistent behaviour. We present the results of initial experiments using JustBench with FaCT++, HermiT, and Pellet. Finally, we show how JustBench can be used by reasoner developers and ontology engineers seeking to understand and improve the performance characteristics of reasoners and ontologies.
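To illustrate the central notion, the following is a minimal sketch (not the authors' implementation, and deliberately far simpler than OWL reasoning) of the standard "contraction" step used to compute a single justification: starting from the whole ontology, greedily remove axioms while the entailment still holds, so the survivors form a minimal entailing subset. Axioms are modelled here as toy subsumption pairs, and entailment as reachability in the implied subsumption graph; the names `entails` and `justification` are illustrative.

```python
def entails(axioms, goal):
    """Check whether the axiom set entails the subsumption goal = (sub, super).

    Toy semantics: each axiom (a, b) reads "a is subsumed by b", and
    entailment is reachability over the axioms' transitive closure.
    """
    sub, sup = goal
    reachable = {sub}
    changed = True
    while changed:
        changed = False
        for a, b in axioms:
            if a in reachable and b not in reachable:
                reachable.add(b)
                changed = True
    return sup in reachable

def justification(axioms, goal):
    """Return one minimal subset of `axioms` that still entails `goal`."""
    just = list(axioms)
    for ax in list(just):
        candidate = [a for a in just if a != ax]
        if entails(candidate, goal):  # axiom is redundant for this entailment
            just = candidate
    return just

# Two justifications exist for A ⊑ C here ({A⊑B, B⊑C} and {A⊑D, D⊑C});
# contraction returns one minimal one, and C ⊑ E plays no part in it.
ontology = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "C"), ("C", "E")]
print(justification(ontology, ("A", "C")))  # → [('A', 'D'), ('D', 'C')]
```

Because each justification is small, a benchmark can time a reasoner on exactly the axioms responsible for one entailment, which is the unit-of-test idea behind JustBench.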


Keywords: Description Logic · Reasoner Performance · Conjunctive Query · Benchmark Suite · Selection Principle



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Samantha Bail (1)
  • Bijan Parsia (1)
  • Ulrike Sattler (1)
  1. The University of Manchester, Manchester, U.K.
