Top-k relevant semantic place retrieval on spatiotemporal RDF data

Abstract

RDF data are traditionally accessed using structured query languages such as SPARQL. However, this requires users to understand both the query language and the RDF schema. Keyword search on RDF data aims to relieve users of these requirements: users only input a set of keywords, and the goal is to find small RDF subgraphs that contain all of them. At the same time, popular RDF knowledge bases also include spatial and temporal semantics, which opens the way to spatiotemporal search operations. In this work, we propose and study novel keyword-based search queries with spatial semantics on RDF data, namely kSP queries. The objective of a kSP query is to find RDF subgraphs that contain the query keywords and are rooted at spatial entities close to the query location. To add temporal semantics to the kSP query, we propose the kSPT query, which incorporates temporal information in two ways. The first considers the temporal differences between the keyword-matched vertices and the query timestamp. The second uses a temporal range to filter keyword-matched vertices. The novelty of kSP and kSPT queries is that they are spatiotemporal-aware and do not rely on structured query languages. For processing kSP queries, we design an efficient approach comprising two pruning techniques and a data preprocessing technique. This approach is extended and improved with four optimizations to evaluate kSPT queries. Extensive empirical studies on two real datasets demonstrate the superior and robust performance of our proposals compared to baseline methods.
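
To make the kSP idea concrete, the following is a minimal brute-force sketch, not the pruning-based approach of the paper. It assumes that looseness is the sum of BFS hop counts from a place root to the nearest vertex covering each keyword, and that places are ranked by (1 + looseness) multiplied by Euclidean distance to the query location. Both the scoring formula and all names (`looseness`, `ksp`, `graph`, `labels`, `places`) are illustrative assumptions, since the abstract does not define them.

```python
from collections import deque
from math import hypot, inf


def looseness(graph, labels, root, keywords):
    """Assumed looseness: sum of BFS hop counts from root to the nearest
    vertex covering each keyword; inf if some keyword is unreachable."""
    remaining = set(keywords)
    total = 0
    seen = {root}
    queue = deque([(root, 0)])
    while queue and remaining:
        vertex, depth = queue.popleft()
        covered = remaining & labels.get(vertex, set())
        total += depth * len(covered)  # each newly covered keyword costs `depth` hops
        remaining -= covered
        for nxt in graph.get(vertex, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return inf if remaining else total


def ksp(graph, labels, places, query_loc, keywords, k):
    """Brute-force top-k: score every place, keep the k smallest scores.
    The score (1 + looseness) * distance is an assumption for illustration."""
    results = []
    for place, (x, y) in places.items():
        loose = looseness(graph, labels, place, keywords)
        if loose == inf:
            continue  # the subgraph rooted here does not cover all keywords
        dist = hypot(x - query_loc[0], y - query_loc[1])
        results.append(((1 + loose) * dist, place))
    return sorted(results)[:k]


# Toy directed graph: vertex -> successors, vertex -> keyword set, place -> (x, y)
graph = {"p1": ["a", "b"], "a": ["c"], "p2": ["d"]}
labels = {"p1": {"church"}, "a": {"roman"}, "c": {"catholic"},
          "p2": {"church"}, "d": {"roman", "catholic"}, "p3": {"church"}}
places = {"p1": (3.0, 4.0), "p2": (0.0, 1.0), "p3": (0.0, 0.5)}
top = ksp(graph, labels, places, (0.0, 0.0), ["church", "roman", "catholic"], 2)
```

In this toy data, `p3` is the closest place but is excluded because no subgraph rooted at it covers all keywords; `p2` ranks ahead of `p1` because it is both closer and covers the keywords with fewer hops.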

Notes

  1. Disk-based graph representations for RDF data (e.g., [67]) can also be used for larger-scale data.

  2. If multiple trees rooted at p have the same minimum looseness, we can either (1) break ties arbitrarily and select one of them to be the TQSP for p, or (2) keep all trees with the same minimum looseness in a set. Under option (2), the result of a kSP query would be the top-k qualified semantic place sets. The methods proposed in this paper are applicable to both options. For ease of presentation, we adopt option (1) in the rest of the paper.

  3. The temporal difference could be measured in days, minutes, etc.
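
The tie-handling options and the choice of temporal unit described in these notes can be sketched as follows. This is a hedged illustration only: the function names, the `(looseness, tree_id)` candidate representation, and the use of Python `datetime` values for timestamps are assumptions, not details from the paper.

```python
from datetime import datetime, timedelta


def min_looseness_trees(candidates, keep_ties=False):
    """candidates: list of (looseness, tree_id) pairs for trees rooted at one place.
    keep_ties=False mirrors option (1): return a single tree (here, the first tied one).
    keep_ties=True mirrors option (2): return every tree with minimum looseness."""
    best = min(loose for loose, _ in candidates)
    tied = [tree for loose, tree in candidates if loose == best]
    return tied if keep_ties else tied[0]


def temporal_difference(vertex_time, query_time, unit=timedelta(days=1)):
    """Absolute time gap between a keyword-matched vertex and the query
    timestamp, expressed in multiples of `unit` (days by default)."""
    return abs(vertex_time - query_time) / unit


trees = [(3, "t1"), (2, "t2"), (2, "t3")]
one_tree = min_looseness_trees(trees)                   # option (1)
all_trees = min_looseness_trees(trees, keep_ties=True)  # option (2)

gap_days = temporal_difference(datetime(2019, 1, 3), datetime(2019, 1, 1))
gap_minutes = temporal_difference(datetime(2019, 1, 3), datetime(2019, 1, 1),
                                  unit=timedelta(minutes=1))
```

Passing the unit as a `timedelta` keeps the measurement granularity a caller-side choice, which matches the note that days, minutes, or other units could be used.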

References

  1. Alternative fueling station locator. http://www.afdc.energy.gov/locator/stations/
  2. Crime in chicagoland. http://crime.chicagotribune.com/
  3. Data.gov. http://www.data.gov
  4. Dbpedia. http://wiki.dbpedia.org
  5. Hospital compare. http://health.data.gov/def/cqld
  6. Owlim-se. http://owlim.ontotext.com/display/OWLIMv43/OWLIM-SE
  7. Parliament. http://parliament.semwebcentral.org
  8. Patients like me. www.patientslikeme.com
  9. Spot crime. http://www.spotcrime.com/
  10. Virtuoso. http://virtuoso.openlinksw.com
  11. Yago. http://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/yago/
  12. Agrawal, S., Chaudhuri, S., Das, G.: Dbxplorer: a system for keyword-based search over relational databases. In: ICDE, pp. 5–16 (2002)
  13. Battle, R., Kolas, D.: Enabling the geospatial semantic web with parliament and geosparql. Semant. Web 3(4), 355–370 (2012)
  14. Bikakis, N., Giannopoulos, G., Liagouris, J., Skoutas, D., Dalamagas, T., Sellis, T.: Rdivf: diversifying keyword search on RDF graphs. In: TPDL, pp. 413–416 (2013)
  15. Brodt, A., Nicklas, D., Mitschang, B.: Deep integration of spatial query processing into native RDF triple stores. In: SIGSPATIAL, pp. 33–42 (2010)
  16. Cappellari, P., Virgilio, R.D., Maccioni, A., Roantree, M.: A path-oriented RDF index for keyword search query processing. In: DEXA, pp. 366–380 (2011)
  17. Cheng, J., Huang, S., Wu, H., Fu, A.W.: Tf-label: a topological-folding labeling scheme for reachability querying in a large graph. In: SIGMOD, pp. 193–204 (2013)
  18. Cohen, S., Mamou, J., Kanza, Y., Sagiv, Y.: Xsearch: a semantic search engine for XML. In: VLDB, pp. 45–56 (2003)
  19. Dalvi, B.B., Kshirsagar, M., Sudarshan, S.: Keyword search on external memory data graphs. PVLDB 1(1), 1189–1204 (2008)
  20. Elbassuoni, S., Blanco, R.: Keyword search over RDF graphs. In: CIKM, pp. 237–242 (2011)
  21. Elbassuoni, S., Ramanath, M., Schenkel, R., Weikum, G.: Searching RDF graphs with SPARQL and keywords. IEEE Data Eng. Bull. 33(1), 16–24 (2010)
  22. Fagin, R., Lotem, A., Naor, M.: Optimal aggregation algorithms for middleware. In: PODS (2001)
  23. Fu, H., Anyanwu, K.: Effectively interpreting keyword queries on RDF databases with a rear view. In: ISWC, pp. 193–208 (2011)
  24. Giannopoulos, G., Biliri, E., Sellis, T.: Personalizing keyword search on RDF data. In: TPDL, pp. 272–278 (2013)
  25. Guo, L., Shao, F., Botev, C., Shanmugasundaram, J.: XRANK: ranked keyword search over XML documents. In: SIGMOD, pp. 16–27 (2003)
  26. Guttman, A.: R-trees: A dynamic index structure for spatial searching. In: SIGMOD, pp. 47–57 (1984)
  27. Halaschek-Wiener, C., Aleman-Meza, B., Arpinar, I.B., Sheth, A.P.: Discovering and ranking semantic associations over a large RDF metabase. In: VLDB, pp. 1317–1320 (2004)
  28. Han, S., Zou, L., Yu, J.X., Zhao, D.: Keyword search on RDF graphs: a query graph assembly approach. In: CIKM, pp. 227–236 (2017)
  29. He, H., Wang, H., Yang, J., Yu, P.S.: BLINKS: ranked keyword searches on graphs. In: SIGMOD, pp. 305–316 (2007)
  30. Hendler, J.A., Holm, J., Musialek, C., Thomas, G.: US government linked open data: semantic.data.gov. IEEE Intell. Syst. 27(3), 25–31 (2012)
  31. Hjaltason, G.R., Samet, H.: Distance browsing in spatial databases. ACM Trans. Database Syst. 24(2), 265–318 (1999)
  32. Hoffart, J., Suchanek, F.M., Berberich, K., Weikum, G.: YAGO2: a spatially and temporally enhanced knowledge base from Wikipedia. Artif. Intell. 194, 28–61 (2013)
  33. Hristidis, V., Gravano, L., Papakonstantinou, Y.: Efficient IR-style keyword search over relational databases. In: VLDB, pp. 850–861 (2003)
  34. Hristidis, V., Papakonstantinou, Y.: DISCOVER: keyword search in relational databases. In: VLDB, pp. 670–681 (2002)
  35. Inglis, J.: Inverted indexes and multi-list structures. Comput. J. 17(1), 59–63 (1974)
  36. Jiang, H., Wang, H., Yu, P.S., Zhou, S.: Gstring: a novel approach for efficient search in graph databases. In: ICDE, pp. 566–575 (2007)
  37. Jin, R., Ruan, N., Dey, S., Yu, J.X.: SCARAB: scaling reachability computation on large graphs. In: SIGMOD, pp. 169–180 (2012)
  38. Jin, R., Ruan, N., Xiang, Y., Wang, H.: Path-tree: an efficient reachability indexing scheme for large directed graphs. ACM Trans. Database Syst. 36(1), 7 (2011)
  39. Kacholia, V., Pandit, S., Chakrabarti, S., Sudarshan, S., Desai, R., Karambelkar, H.: Bidirectional expansion for keyword search on graph databases. In: VLDB, pp. 505–516 (2005)
  40. Koubarakis, M., Kyzirakos, K.: Modeling and querying metadata in the semantic sensor web: the model stRDF and the query language stSPARQL. In: ESWC, pp. 425–439 (2010)
  41. Kyzirakos, K., Karpathiotakis, M., Koubarakis, M.: Strabon: a semantic geospatial DBMS. In: ISWC, pp. 295–311 (2012)
  42. Le, W., Li, F., Kementsietsidis, A., Duan, S.: Scalable keyword search on large RDF data. TKDE 26(11), 2774–2788 (2014)
  43. Leskovec, J., Faloutsos, C.: Sampling from large graphs. In: KDD, pp. 631–636 (2006)
  44. Liagouris, J., Mamoulis, N., Bouros, P., Terrovitis, M.: An effective encoding scheme for spatial RDF data. PVLDB 7(12), 1271–1282 (2014)
  45. Lian, X., Hoyos, E.D., Chebotko, A., Fu, B., Reilly, C.: K-nearest keyword search in RDF graphs. J. Web Sem. 22, 40–56 (2013)
  46. Libkin, L., Reutter, J.L., Soto, A., Vrgoc, D.: Trial: a navigational algebra for RDF triplestores. ACM Trans. Database Syst. 43(1), 5:1–5:46 (2018)
  47. Lin, X., Ma, Z., Yan, L.: RDF keyword search using a type-based summary. J. Inf. Sci. Eng. 34(2), 489–504 (2018)
  48. Liu, Z., Wang, C., Chen, Y.: Keyword search on temporal graphs. In: ICDE, pp. 1807–1808 (2018)
  49. Neumann, T., Weikum, G.: RDF-3X: a risc-style engine for RDF. PVLDB 1(1), 647–659 (2008)
  50. Papadias, D., Zhang, J., Mamoulis, N., Tao, Y.: Query processing in spatial network databases. In: VLDB, pp. 802–813 (2003)
  51. Peng, P., Zou, L., Qin, Z.: Answering top-k query combined keywords and structural queries on RDF graphs. Inf. Syst. 67, 19–35 (2017)
  52. Ponte, J.M., Croft, W.B.: A language modeling approach to information retrieval. In: SIGIR, pp. 275–281 (1998)
  53. Prud’Hommeaux, E., Seaborne, A., et al.: Sparql query language for rdf. W3C Recomm. 15 (2008). https://www.w3.org/TR/rdfsparql-query
  54. Shasha, D., Wang, J.T.L., Giugno, R.: Algorithmics and applications of tree and graph searching. In: PODS, pp. 39–52 (2002)
  55. Shi, J., Wu, D., Mamoulis, N.: Top-k relevant semantic place retrieval on spatial RDF data. In: SIGMOD, pp. 1977–1990 (2016)
  56. Tran, T., Wang, H., Rudolph, S., Cimiano, P.: Top-k exploration of query candidates for efficient keyword search on graph-shaped (RDF) data. In: ICDE, pp. 405–416 (2009)
  57. van Schaik, S.J., de Moor, O.: A memory efficient reachability data structure through bit vector compression. In: SIGMOD, pp. 913–924 (2011)
  58. Wang, C., Ku, W., Chen, H.: Geo-store: a spatially-augmented SPARQL query evaluation system. In: SIGSPATIAL, pp. 562–565 (2012)
  59. Wang, D., Zou, L., Feng, Y., Shen, X., Tian, J., Zhao, D.: S-store: an engine for large RDF graph integrating spatial information. In: DASFAA, pp. 31–47 (2013)
  60. Wang, D., Zou, L., Zhao, D.: GST-store: an engine for large RDF graph integrating spatiotemporal information. In: EDBT, pp. 652–655 (2014)
  61. Wang, H., Aggarwal, C.C.: A survey of algorithms for keyword search on graph data. In: Managing and Mining Graph Data, pp. 249–273 (2010)
  62. Wylot, M., Hauswirth, M., Cudré-Mauroux, P., Sakr, S.: RDF data storage and query processing schemes: a survey. ACM Comput. Surv. 51(4), 84:1–84:36 (2018)
  63. Yan, X., Yu, P.S., Han, J.: Substructure similarity search in graph databases. In: SIGMOD, pp. 766–777 (2005)
  64. Yildirim, H., Chaoji, V., Zaki, M.J.: GRAIL: scalable reachability index for large graphs. PVLDB 3(1), 276–284 (2010)
  65. Zeng, K., Yang, J., Wang, H., Shao, B., Wang, Z.: A distributed graph engine for web scale RDF data. PVLDB 6(4), 265–276 (2013)
  66. Zhong, M., Wang, Y., Zhu, Y.: Coverage-oriented diversification of keyword search results on graphs. In: DASFAA, pp. 166–183 (2018)
  67. Zou, L., Mo, J., Chen, L., Özsu, M.T., Zhao, D.: gStore: answering SPARQL queries via subgraph matching. PVLDB 4(8), 482–493 (2011)

Author information

Correspondence to Jieming Shi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported in part by Grant No. 2019A1515011721 from the Natural Science Foundation of Guangdong, China, by Grant No. 61502310 from the National Natural Science Foundation of China, and by Grant No. 17253616 from the Hong Kong RGC.

About this article

Cite this article

Wu, D., Zhou, H., Shi, J. et al. Top-k relevant semantic place retrieval on spatiotemporal RDF data. The VLDB Journal (2019). https://doi.org/10.1007/s00778-019-00591-8

Keywords

  • Semantic place
  • RDF data
  • Spatiotemporal data