Using Reformulation Trees to Optimize Queries over Distributed Heterogeneous Sources

  • Yingjie Li
  • Jeff Heflin
Conference paper

DOI: 10.1007/978-3-642-17746-0_32

Part of the Lecture Notes in Computer Science book series (LNCS, volume 6496)
Cite this paper as:
Li Y., Heflin J. (2010) Using Reformulation Trees to Optimize Queries over Distributed Heterogeneous Sources. In: Patel-Schneider P.F. et al. (eds) The Semantic Web – ISWC 2010. ISWC 2010. Lecture Notes in Computer Science, vol 6496. Springer, Berlin, Heidelberg

Abstract

In order to effectively and quickly answer queries in environments with distributed RDF/OWL, we present a query optimization algorithm to identify the potentially relevant Semantic Web data sources using structural query features and a term index. This algorithm is based on the observation that the join selectivity of a pair of query triple patterns is often higher than the overall selectivity of these two patterns treated independently. Given a rule goal tree that expresses the reformulation of a conjunctive query, our algorithm uses a bottom-up approach to estimate the selectivity of each node. It then prioritizes loading of selective nodes and uses the information from these sources to further constrain other nodes. Finally, we use an OWL reasoner to answer queries over the selected sources and their corresponding ontologies. We have evaluated our system using both a synthetic data set and a subset of the real-world Billion Triple Challenge data.
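
The bottom-up estimation described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm. It is a simplified example under stated assumptions: the term index is a hypothetical in-memory dictionary from terms to source identifiers, a goal node's cost is the number of sources that mention all of its constant terms, an OR node sums its children's costs, and an AND node takes the cost of its most selective child. All names (`TERM_INDEX`, `Node`, `estimate`, `load_order`) and the sample data are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List

# Hypothetical term index: term -> identifiers of sources that mention it.
TERM_INDEX: Dict[str, FrozenSet[str]] = {
    "foaf:Person": frozenset({"s1", "s2", "s3", "s4"}),
    "ex:Researcher": frozenset({"s3", "s5"}),
    "swrc:author": frozenset({"s2", "s5", "s6"}),
    "ex:alice": frozenset({"s2"}),
}
ALL_SOURCES: FrozenSet[str] = frozenset().union(*TERM_INDEX.values())


@dataclass
class Node:
    kind: str                                            # "goal", "and", or "or"
    terms: List[str] = field(default_factory=list)       # constants of a goal's triple pattern
    children: List["Node"] = field(default_factory=list)


def goal_sources(node: Node) -> FrozenSet[str]:
    """Sources that mention every constant term of a goal's triple pattern."""
    if not node.terms:
        # A pattern with no constants cannot be narrowed by the index.
        return ALL_SOURCES
    sets = [TERM_INDEX.get(t, frozenset()) for t in node.terms]
    return frozenset.intersection(*sets)


def estimate(node: Node) -> int:
    """Bottom-up cost estimate: fewer candidate sources means more selective."""
    if node.kind == "goal":
        return len(goal_sources(node))
    costs = [estimate(child) for child in node.children]
    # Assumption: alternatives (OR) add up; a conjunction (AND) is bounded
    # by its most selective conjunct, which would be loaded first.
    return sum(costs) if node.kind == "or" else min(costs)


def load_order(and_node: Node) -> List[Node]:
    """Order the children of an AND node from most to least selective."""
    return sorted(and_node.children, key=estimate)


# Example tree: (?p swrc:author ex:alice) AND (?x is a Person OR a Researcher)
author_goal = Node("goal", terms=["swrc:author", "ex:alice"])
type_or = Node("or", children=[
    Node("goal", terms=["foaf:Person"]),
    Node("goal", terms=["ex:Researcher"]),
])
query = Node("and", children=[author_goal, type_or])
```

In this sketch, `load_order(query)` places `author_goal` first because only one source mentions both of its terms. A fuller implementation would then use the data loaded for that node to constrain the remaining nodes, as the abstract describes.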

Keywords

information integration, query optimization, query reformulation, source selectivity

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Yingjie Li (1)
  • Jeff Heflin (1)
  1. Department of Computer Science and Engineering, Lehigh University, Bethlehem, U.S.A.
