Top-k Spatio-textual Similarity Search

  • Sitong Liu
  • Yaping Chu
  • Huiqi Hu
  • Jianhua Feng
  • Xuan Zhu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8485)

Abstract

Location-based services have attracted significant attention owing to the ubiquity of GPS-equipped smartphones. These services (e.g., Twitter, Foursquare) generate large amounts of spatio-textual data, which contain both a geographical location and a textual description. In this paper, we study a prevalent top-k spatio-textual similarity search problem: given a set of objects and a user query, find the k most relevant objects, considering both spatial location and textual description. We make the following contributions: (1) We propose a framework based on the threshold algorithm (TA) and devise efficient algorithms that incrementally visit the objects with the current highest spatial or textual similarity. (2) We explore a hybrid partition pattern that integrates spatial and textual pruning power, and we further propose a partition-based algorithm that significantly improves performance. (3) We conduct extensive experiments on real and synthetic datasets. The experimental results show that our methods outperform state-of-the-art algorithms.
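To make the TA-based idea concrete, the following is a minimal sketch of a threshold-algorithm style top-k search over one spatial and one textual ranking. It is not the paper's implementation. The scoring function (a linear combination weighted by `alpha`), the Jaccard textual similarity, the distance-to-similarity mapping, and all names (`ta_top_k`, `spatial_sim`, `max_dist`) are assumptions made for illustration; the abstract does not specify them. The two pre-sorted lists stand in for the incremental access that a real system would obtain from an index.

```python
import heapq
import math


def jaccard(a, b):
    """Jaccard similarity of two token sets (assumed textual measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def spatial_sim(p, q, max_dist):
    """Map Euclidean distance to a similarity in [0, 1] (assumed mapping)."""
    return max(0.0, 1.0 - math.dist(p, q) / max_dist)


def ta_top_k(objects, q_loc, q_tokens, k, alpha=0.5, max_dist=1.0):
    """Return the k best (score, id) pairs, highest score first.

    objects: dict mapping id -> (location tuple, token set).
    """
    spatial = {oid: spatial_sim(loc, q_loc, max_dist)
               for oid, (loc, _) in objects.items()}
    textual = {oid: jaccard(tok, q_tokens)
               for oid, (_, tok) in objects.items()}

    # Sorted access lists: a stand-in for incremental index traversal.
    by_spatial = sorted(spatial, key=spatial.get, reverse=True)
    by_textual = sorted(textual, key=textual.get, reverse=True)

    def score(oid):
        # Random access: look up both similarities for a seen object.
        return alpha * spatial[oid] + (1 - alpha) * textual[oid]

    seen = set()
    top = []  # min-heap holding the current best k (score, id) pairs
    for s_oid, t_oid in zip(by_spatial, by_textual):
        for oid in (s_oid, t_oid):
            if oid not in seen:
                seen.add(oid)
                heapq.heappush(top, (score(oid), oid))
                if len(top) > k:
                    heapq.heappop(top)
        # No unseen object can score above this bound.
        threshold = alpha * spatial[s_oid] + (1 - alpha) * textual[t_oid]
        if len(top) == k and top[0][0] >= threshold:
            break
    return sorted(top, reverse=True)


objects = {
    "a": ((0.0, 0.0), {"coffee", "shop"}),
    "b": ((0.1, 0.1), {"coffee", "bar"}),
    "c": ((0.9, 0.9), {"coffee", "shop"}),
    "d": ((0.5, 0.5), {"book", "store"}),
}
result = ta_top_k(objects, (0.0, 0.0), {"coffee", "shop"}, k=2,
                  alpha=0.5, max_dist=math.sqrt(2))
print([oid for _, oid in result])
```

The early stop relies on the combined score being monotone in both similarities, which holds for the assumed linear combination; a different scoring function would need the same property for the threshold bound to remain valid.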

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Sitong Liu (1)
  • Yaping Chu (1)
  • Huiqi Hu (1)
  • Jianhua Feng (1)
  • Xuan Zhu (2)

  1. Department of Computer Science and Technology, Tsinghua University, China
  2. Samsung R&D Institute, Beijing, China
