Abstract
The web is increasingly being accessed from geo-positioned devices such as smartphones, and rapidly increasing volumes of web content are geo-tagged. In addition, studies show that a substantial fraction of all web queries has local intent. This development motivates the study of advanced spatial keyword-based querying of web content. Previous research has primarily focused on the retrieval of the top-k individual spatial web objects that best satisfy a query specifying a location and a set of keywords. This paper proposes a new type of query functionality that returns top-k groups of objects while taking into account aspects such as group density, distance to the query, and relevance to the query keywords. To enable efficient processing, novel indexing and query processing techniques for single and multiple keyword queries are proposed. Empirical performance studies with an implementation of the techniques and real data suggest that the proposals are viable in practical settings.
Acknowledgments
This research was supported in part by the European Union Seventh Framework Programme—Marie Curie Actions, Initial Training Network Geocrowd (http://www.geocrowd.eu) under Grant Agreement No. FP7-PEOPLE-2010-ITN-264994.
Cite this article
Skovsgaard, A., Jensen, C.S. Finding top-k relevant groups of spatial web objects. The VLDB Journal 24, 537–555 (2015). https://doi.org/10.1007/s00778-015-0388-z
DOI: https://doi.org/10.1007/s00778-015-0388-z