Multi-query Optimization in Federated RDF Systems

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10827)

Abstract

This paper revisits the classical problem of multiple query optimization in federated RDF systems. We propose a heuristic, query rewriting-based approach that shares common computation across the evaluation of multiple queries while accounting for the cost of both query evaluation and data shipment. Furthermore, we propose an efficient method that uses the interconnection topology between RDF sources to filter out irrelevant sources and to share the common computation of joining intermediate results. Experiments over both real and synthetic RDF datasets show that our techniques are efficient.

Notes

Acknowledgement

This work was supported by the National Key Research and Development Program of China under grant 2016YFB1000603, by NSFC under grants 61702171, 61622201 and 61532010, and by the Fundamental Research Funds for the Central Universities. Özsu’s work was supported in part by the Natural Sciences and Engineering Research Council (NSERC) of Canada.

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Peng Peng (1)
  • Lei Zou (2, 3)
  • M. Tamer Özsu (4)
  • Dongyan Zhao (2)

  1. Hunan University, Changsha, China
  2. Peking University, Beijing, China
  3. Beijing Institute of Big Data Research, Beijing, China
  4. University of Waterloo, Waterloo, Canada
