Improving SPARQL query performance with algebraic expression tree based caching and entity caching

Journal of Zhejiang University SCIENCE C

Abstract

To achieve query performance comparable to that of relational databases, diverse database technologies have to be adapted to cope with the complexity of both Resource Description Framework (RDF) data and SPARQL queries. Database caching is one such technology: it improves database performance at a reasonable space cost by exploiting the principles of spatial, temporal, and semantic locality. However, the caching schemes currently used in RDF stores prove ineffective for queries with complex semantics. Although semantic caching approaches work effectively in this case, little work has been done in this area. In this paper, we improve SPARQL query performance with two semantic caching approaches: SPARQL algebraic expression tree (AET) based caching and entity caching. With these two approaches, successive queries with multiple identical sub-queries and star-shaped joins can be evaluated efficiently. Both approaches are implemented on a two-level storage structure: main memory holds the most frequently accessed cache items, and items swapped out of memory are stored on disk for possible future reuse. Evaluation results on three mainstream RDF benchmarks demonstrate the effectiveness and efficiency of our approaches. Comparisons with previous research are also provided.
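The idea of reusing results for identical sub-queries on a two-level store can be illustrated with a minimal Python sketch. It is not the authors' implementation. Every name in it (`canonical_key`, `TwoLevelCache`, `evaluate`, `run_query`) is hypothetical, the algebra tree is assumed to be encoded as nested tuples, and a simple least-recently-used policy stands in for the frequency-based replacement described in the abstract. Variable renaming between queries is not handled.

```python
import hashlib
import os
import pickle
from collections import OrderedDict

# Operators whose children may be reordered without changing the result.
COMMUTATIVE = {"join", "union"}


def canonical_key(node):
    """Return a canonical string for an algebra tree given as nested tuples.

    Assumed encoding (illustrative only): a leaf is ("tp", s, p, o) with
    string terms; an inner node is (op, child, child, ...).
    """
    op, *args = node
    if op == "tp":
        return "tp(" + " ".join(args) + ")"
    keys = [canonical_key(child) for child in args]
    if op in COMMUTATIVE:
        keys.sort()  # equivalent child orderings map to the same key
    return op + "(" + ",".join(keys) + ")"


class TwoLevelCache:
    """Memory level with LRU eviction; evicted items are pickled to disk."""

    def __init__(self, capacity, spill_dir):
        self.capacity = capacity
        self.spill_dir = spill_dir
        self.memory = OrderedDict()  # least recently used item comes first

    def _path(self, key):
        digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
        return os.path.join(self.spill_dir, digest + ".pkl")

    def put(self, key, result):
        self.memory[key] = result
        self.memory.move_to_end(key)
        while len(self.memory) > self.capacity:
            old_key, old_result = self.memory.popitem(last=False)
            with open(self._path(old_key), "wb") as f:
                pickle.dump(old_result, f)  # swap out to the disk level

    def get(self, key):
        if key in self.memory:
            self.memory.move_to_end(key)
            return self.memory[key]
        path = self._path(key)
        if os.path.exists(path):
            with open(path, "rb") as f:
                result = pickle.load(f)
            os.remove(path)
            self.put(key, result)  # promote back into memory
            return result
        return None


def evaluate(node, cache, run_query):
    """Answer a (sub-)query from the cache, or run it and cache the result."""
    key = canonical_key(node)
    cached = cache.get(key)
    if cached is not None:
        return cached
    result = run_query(node)  # run_query is a caller-supplied evaluator
    cache.put(key, result)
    return result
```

In this sketch, a sub-query that appears again in a later query, possibly with its join operands in a different order, maps to the same key and is answered from memory or from disk without calling `run_query` a second time.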

Author information

Corresponding author

Correspondence to Gang Wu.

Additional information

Project supported by the National Natural Science Foundation of China (Nos. 60903010, 61025007, and 60933001), the National Basic Research Program (973) of China (No. 2011CB302206), the Natural Science Foundation of Jiangsu Province, China (No. BK2009268), the Fundamental Research Funds for the Central Universities (No. N110404013), and the Key Laboratory of Advanced Information Science and Network Technology of Beijing (No. XDXX1011).

About this article

Cite this article

Wu, G., Yang, Md. Improving SPARQL query performance with algebraic expression tree based caching and entity caching. J. Zhejiang Univ. - Sci. C 13, 281–294 (2012). https://doi.org/10.1631/jzus.C1101009

  • DOI: https://doi.org/10.1631/jzus.C1101009
