Optimizing SPARQL Query Processing on Dynamic and Static Data Based on Query Time/Freshness Requirements Using Materialization

  • Soheila Dehghanzadeh
  • Josiane Xavier Parreira
  • Marcel Karnstedt
  • Juergen Umbrich
  • Manfred Hauswirth
  • Stefan Decker
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8943)

Abstract

For integrating multiple Linked Datasets, data warehousing and live query processing represent two extremes, optimized for response time and for quality respectively. Data warehousing provides very fast responses but of low quality, because changes to the original data are not immediately reflected in the materialized data. Live query processing provides accurate responses but is notorious for long response times. A hybrid SPARQL query processor offers a middle ground between these two extremes by splitting the triple patterns of a SPARQL query between live and local processors based on a predetermined coherence threshold specified by the administrator. Taking quality requirements into account while splitting the SPARQL query enables the processor to eliminate unnecessary live execution and release resources for other queries. This requires estimating the quality of the response that the current materialized data can provide, comparing it with the user requirements, and determining the most selective sub-queries that can raise the response quality to the specified level with the least computational complexity. In this work, we propose solutions for estimating the freshness of materialized data, as one dimension of quality, by extending cardinality estimation techniques. Experimental results show that we can estimate the freshness of materialized data with a low error rate.
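
The splitting idea described above can be illustrated with a minimal sketch. It is not the method of the paper: the `TriplePattern` fields, the independence assumption (a joined result counts as fresh only if every contributing local pattern is fresh, so the per-pattern fresh fractions multiply), and the greedy "cheapest stale pattern first" rule are all assumptions made here for illustration. The numbers in the usage lines are invented.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class TriplePattern:
    """Hypothetical per-pattern statistics kept by the local store."""
    text: str           # the triple pattern, e.g. "?s foaf:name ?n"
    cardinality: int    # estimated number of matching local triples
    freshness: float    # estimated fraction of those triples that are up to date


def estimate_freshness(local_patterns):
    """Estimated fresh fraction of the join over the locally answered patterns.

    Assumes freshness is independent across patterns, so fractions multiply.
    An empty list (everything fetched live) yields 1.0.
    """
    return prod(p.freshness for p in local_patterns)


def split_query(patterns, threshold):
    """Greedily move stale patterns to live execution until the estimated
    freshness of the local part reaches the threshold.

    Stale patterns are tried in order of increasing cardinality, using
    cardinality as a rough stand-in for the cost of fetching them live.
    Returns (local_patterns, live_patterns).
    """
    local = list(patterns)
    live = []
    stale = sorted((p for p in patterns if p.freshness < 1.0),
                   key=lambda p: p.cardinality)
    for pattern in stale:
        if estimate_freshness(local) >= threshold:
            break
        local.remove(pattern)
        live.append(pattern)
    return local, live


# Illustrative statistics only; not taken from the paper.
patterns = [
    TriplePattern("?p rdf:type ex:Product", 1000, 0.9),
    TriplePattern("?p ex:price ?price", 10, 0.5),
    TriplePattern("?p rdfs:label ?label", 100, 1.0),
]
local, live = split_query(patterns, threshold=0.8)
print([p.text for p in live])   # patterns sent to the live processor
```

With a threshold of 0.8, only the small but stale price pattern is executed live in this example, while the other two patterns are answered from materialized data. A real processor would replace the cardinality-based cost proxy and the independence assumption with its own estimators.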

Keywords

Freshness estimation; Hybrid SPARQL Querying; RDF Data Warehouse; View Materialization

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Soheila Dehghanzadeh (1)
  • Josiane Xavier Parreira (2)
  • Marcel Karnstedt (3)
  • Juergen Umbrich (4)
  • Manfred Hauswirth (5)
  • Stefan Decker (1)

  1. Insight Centre for Data Analytics, National University of Ireland, Galway, Republic of Ireland
  2. Siemens AG Österreich, Wien, Austria
  3. Bell Labs, Dublin, Ireland
  4. Vienna University of Economics and Business, Vienna, Austria
  5. TU Berlin and Fraunhofer FOKUS, Berlin, Germany
