Sharing Statistics for SPARQL Federation Optimization, with Emphasis on Benchmark Quality

  • Kjetil Kjernsmo
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7295)

Abstract

Federation of semantic data on SPARQL endpoints will allow data to remain distributed so that it can be controlled by local curators and swiftly updated. There are considerable performance problems, which the present work proposes to address, mainly by computation and exposure of statistical digests to assist selectivity estimation.

For an objective evaluation as well as comparison of engines, benchmarks that exhaustively covers the parameter space is required. We propose an investigation into this problem using statistical experimental planning.

References

  1. 1.
    Buil-Aranda, C., Arenas, M., Corcho, O.: Semantics and Optimization of the SPARQL 1.1 Federation Extension. In: Antoniou, G., Grobelnik, M., Simperl, E., Parsia, B., Plexousakis, D., De Leenheer, P., Pan, J. (eds.) ESWC 2011, Part II. LNCS, vol. 6644, pp. 1–15. Springer, Heidelberg (2011)CrossRefGoogle Scholar
  2. 2.
    Duan, S., Kementsietsidis, A., Srinivas, K., Udrea, O.: Apples and oranges: a comparison of RDF benchmarks and real RDF datasets. In: Proc. of the 2011 Int. Conf. on Management of Data, SIGMOD 2011, pp. 145–156. ACM (2011)Google Scholar
  3. 3.
    Getoor, L., Taskar, B., Koller, D.: Selectivity estimation using probabilistic models. In: Proc. of the 2001 ACM SIGMOD Int. Conf. on Management of Data, SIGMOD 2001, pp. 461–472. ACM (2001)Google Scholar
  4. 4.
    Görlitz, O., Staab, S.: Federated Data Management and Query Optimization for Linked Open Data. In: Vakali, A., Jain, L.C. (eds.) New Directions in Web Data Management 1. SCI, vol. 331, pp. 109–137. Springer, Heidelberg (2011)CrossRefGoogle Scholar
  5. 5.
    Harth, A., Hose, K., Karnstedt, M., Polleres, A., Sattler, K.-U., Umbrich, J.: Data summaries for on-demand queries over linked data. In: Proc. of the 19th Int. Conf. on World Wide Web, WWW 2010, pp. 411–420. ACM (2010)Google Scholar
  6. 6.
    Morsey, M., Lehmann, J., Auer, S., Ngomo, A.-C.N.: DBpedia SPARQL Benchmark – Performance Assessment with Real Queries on Real Data. In: Aroyo, L., Welty, C., Alani, H., Taylor, J., Bernstein, A., Kagal, L., Noy, N., Blomqvist, E. (eds.) ISWC 2011, Part I. LNCS, vol. 7031, pp. 454–469. Springer, Heidelberg (2011)CrossRefGoogle Scholar
  7. 7.
    Schmidt, M., Görlitz, O., Haase, P., Ladwig, G., Schwarte, A., Tran, T.: FedBench: A Benchmark Suite for Federated Semantic Data Query Processing. In: Aroyo, L., Welty, C., Alani, H., Taylor, J., Bernstein, A., Kagal, L., Noy, N., Blomqvist, E. (eds.) ISWC 2011, Part I. LNCS, vol. 7031, pp. 585–600. Springer, Heidelberg (2011)CrossRefGoogle Scholar
  8. 8.
    Schwarte, A., Haase, P., Hose, K., Schenkel, R., Schmidt, M.: FedX: Optimization Techniques for Federated Query Processing on Linked Data. In: Aroyo, L., Welty, C., Alani, H., Taylor, J., Bernstein, A., Kagal, L., Noy, N., Blomqvist, E. (eds.) ISWC 2011, Part I. LNCS, vol. 7031, pp. 601–616. Springer, Heidelberg (2011)CrossRefGoogle Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Kjetil Kjernsmo
    • 1
  1. 1.Department of InformaticsUniversity of OsloOsloNorway

Personalised recommendations