Abstract
The use of RDF (Resource Description Framework) data is a cornerstone of the Semantic Web. RDF data embedded in Web pages may be indexed by semantic search engines; however, RDF data is often stored in databases that are accessible via Web Services using SPARQL, the query language for RDF. Such databases form part of the Deep Web, which is not accessible to search engines. This paper addresses the problem of effectively integrating RDF data stored in separate Web-accessible databases. An approach based on distributed query processing is described, in which data from multiple repositories are used to construct partitioned tables that are integrated using an adaptive query processing technique supporting join reordering. This technique limits reliance on statistics and metadata about SPARQL endpoints; such information is required by existing systems that support federated SPARQL queries, yet it is often inaccurate or unavailable. The approach extends existing work in this area by allowing tables to be added to the query plan while it is executing, and shows how a technique currently used in relational query processing can be applied to distributed SPARQL query processing. The approach is evaluated using a prototype implementation, and potential applications are discussed.
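The general idea of choosing a join order from run-time observations, rather than from endpoint statistics, can be illustrated with a small in-memory sketch. This is not the algorithm of the paper or of its prototype: the function names (`join`, `observed_fanout`, `adaptive_join`), the sampling heuristic, and the example data are all assumptions made for illustration. Each table is assumed to hold the solutions already fetched for one triple pattern, with every row binding the same variables.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]   # one solution: variable name -> bound value
Table = List[Row]      # solutions fetched for one triple pattern (same variables in every row)


def join(left: Table, right: Table) -> Table:
    """Natural hash join on the variables shared by the two tables."""
    if not left or not right:
        return []
    shared = sorted(set(left[0]) & set(right[0]))
    index: Dict[tuple, List[Row]] = {}
    for row in right:
        index.setdefault(tuple(row[v] for v in shared), []).append(row)
    # With no shared variables every key is (), so this degrades to a cross product.
    return [{**l, **r} for l in left for r in index.get(tuple(l[v] for v in shared), [])]


def observed_fanout(current: Table, candidate: Table, sample_size: int = 50) -> float:
    """Output rows per input row, measured by joining a sample of `current`."""
    sample = current[:sample_size]
    if not sample:
        return 0.0
    return len(join(sample, candidate)) / len(sample)


def adaptive_join(tables: List[Table]) -> Tuple[Table, List[int]]:
    """Greedy join whose order is decided while executing, from observed fan-out only."""
    remaining = dict(enumerate(tables))
    start = min(remaining, key=lambda i: len(remaining[i]))  # begin with the smallest table
    current = remaining.pop(start)
    order = [start]
    while remaining:
        # Re-measure against the current intermediate result before every step.
        nxt = min(remaining, key=lambda i: observed_fanout(current, remaining[i]))
        current = join(current, remaining.pop(nxt))
        order.append(nxt)
    return current, order


# Hypothetical data standing in for results from three separate endpoints.
people = [{"person": "a", "name": "Alice"}, {"person": "b", "name": "Bob"}]
authored = [{"person": "a", "paper": "p1"}, {"person": "a", "paper": "p2"},
            {"person": "b", "paper": "p3"}]
venue = [{"paper": "p1", "venue": "DNIS"}]

result, order = adaptive_join([authored, people, venue])
```

In this example the single-row `venue` table is taken first, and the cross product with `people` is deferred because its measured fan-out is higher than that of `authored`. The sketch does not cover adding tables to a plan that is already executing, which the abstract names as part of the approach.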
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lynden, S., Kojima, I., Matono, A., Tanimura, Y. (2010). Adaptive Integration of Distributed Semantic Web Data. In: Kikuchi, S., Sachdeva, S., Bhalla, S. (eds) Databases in Networked Information Systems. DNIS 2010. Lecture Notes in Computer Science, vol 5999. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12038-1_12
DOI: https://doi.org/10.1007/978-3-642-12038-1_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12037-4
Online ISBN: 978-3-642-12038-1
eBook Packages: Computer Science, Computer Science (R0)