Approximate querying of RDF graphs via path alignment

Distributed and Parallel Databases

Abstract

A query over RDF data is usually expressed as a matching between a graph representing the target and a very large graph representing the source. Graph matching is typically performed as subgraph isomorphism, which is NP-complete and makes semantic data querying a hard problem. In this paper we present a novel technique for querying RDF data in which answers are built by combining paths of the underlying data graph that align with paths specified by the query. The approach is approximate: it generates the combinations of paths that best align with the query. We show that this significantly reduces the complexity of the overall process, and we verify experimentally that our framework compares favorably with other approaches in terms of both efficiency and effectiveness.
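
To make the idea of aligning query paths with data paths concrete, the following is a minimal sketch, not the algorithm of the paper. It assumes a toy acyclic graph given as triples, treats a path as the sequence of node and edge labels from a source (no incoming edges) to a sink (no outgoing edges), and uses a plain edit distance in which a query variable such as `?x` matches any label at no cost. The data, the function names `paths`, `alignment_cost`, and `best_paths`, and the cost model are all illustrative assumptions; the step of combining several aligned paths into one answer is not covered.

```python
# Toy RDF graph as (subject, predicate, object) triples; all names are hypothetical.
TRIPLES = [
    ("alice", "sponsor", "bill1"),
    ("bill1", "subject", "health"),
    ("bob", "sponsor", "bill2"),
    ("bill2", "subject", "energy"),
]


def paths(triples):
    """Enumerate source-to-sink paths as lists of alternating node and edge labels.

    Assumes the graph is acyclic (an assumption of this sketch).
    """
    out = {}
    has_incoming = set()
    nodes = set()
    for s, p, o in triples:
        out.setdefault(s, []).append((p, o))
        has_incoming.add(o)
        nodes.update((s, o))
    result = []

    def walk(node, prefix):
        if node not in out:  # sink: no outgoing edges
            result.append(prefix)
            return
        for p, o in out[node]:
            walk(o, prefix + [p, o])

    for n in sorted(nodes - has_incoming):  # sources: no incoming edges
        walk(n, [n])
    return result


def alignment_cost(query_path, data_path):
    """Edit distance between label sequences; a '?var' label matches anything for free."""
    m, n = len(query_path), len(data_path)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            q, t = query_path[i - 1], data_path[j - 1]
            sub = 0 if q.startswith("?") or q == t else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + sub)
    return d[m][n]


def best_paths(query_path, triples, k=1):
    """Return the k data paths with the lowest alignment cost to the query path."""
    ranked = sorted(paths(triples), key=lambda p: alignment_cost(query_path, p))
    return ranked[:k]
```

For instance, `best_paths(["?x", "sponsor", "?b", "subject", "health"], TRIPLES)` ranks the path through `alice` first because it aligns with zero edits, while a query that uses a label absent from the data would still return its closest path with a non-zero cost. That behaviour is what makes such a scheme approximate rather than exact.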

Notes

  1. http://www.govtrack.us.

  2. A prototype application is available at https://www.dropbox.com/sh/d5u1u24qnyqg18f/7Oefq8-qVa.

  3. A prototype application is available at https://www.dropbox.com/sh/d5u1u24qnyqg18f/7Oefq8-qVa.

  4. http://lucene.apache.org/.

  5. http://www-personal.umich.edu/~mejn/netdata/.

  6. At https://www.dropbox.com/sh/d5u1u24qnyqg18f/7Oefq8-qVa you can find the complete set of queries.

Author information

Corresponding author

Correspondence to Roberto De Virgilio.

Additional information

Communicated by Haixun Wang and Jeffrey Xu Yu.

About this article

Cite this article

De Virgilio, R., Maccioni, A. & Torlone, R. Approximate querying of RDF graphs via path alignment. Distrib Parallel Databases 33, 555–581 (2015). https://doi.org/10.1007/s10619-014-7142-1

  • DOI: https://doi.org/10.1007/s10619-014-7142-1
