Spatio-temporal top-k term search over sliding window

Abstract

Owing in part to the proliferation of GPS-equipped mobile devices, massive volumes of geo-tagged streaming text messages are becoming available on social media. It is of great interest to discover the most frequent nearby terms in such data streams. In this paper, we present novel indexing, updating, and query processing techniques for discovering the top-k most frequent nearby terms over a sliding window. Specifically, given a query location and a set of geo-tagged messages within a sliding window, we study the problem of searching for the top-k terms by considering term frequency, spatial proximity, and term freshness. We develop an efficient mechanism to solve the problem, comprising a quad-tree based indexing structure, an index update technique, and a best-first search algorithm. An empirical study in which a number of parameters are varied shows that the proposed techniques are efficient and meet users’ requirements.
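
To make the problem statement concrete, the following is a minimal brute-force sketch of a top-k term query over a count-based sliding window. It is not the quad-tree index or best-first algorithm proposed in the paper. The class name `SlidingWindowTermSearch`, the linear proximity weight, the exponential freshness decay, and the parameters `max_dist` and `decay` are all illustrative assumptions, since the abstract does not give the scoring function.

```python
import math
from collections import defaultdict, deque


class SlidingWindowTermSearch:
    """Brute-force baseline; every scoring choice here is an assumption."""

    def __init__(self, window_size, max_dist):
        self.window_size = window_size  # number of most recent messages kept
        self.max_dist = max_dist        # assumed distance used for normalisation
        self.window = deque()           # messages ordered from oldest to newest

    def add(self, x, y, timestamp, terms):
        """Append a geo-tagged message and evict the oldest one if needed."""
        self.window.append((x, y, timestamp, tuple(terms)))
        if len(self.window) > self.window_size:
            self.window.popleft()

    def top_k(self, qx, qy, now, k, decay=0.01):
        """Return the k highest-scoring (term, score) pairs for a query point."""
        scores = defaultdict(float)
        for x, y, ts, terms in self.window:
            dist = math.hypot(x - qx, y - qy)
            # Assumed spatial proximity: 1 at the query point, 0 at max_dist.
            proximity = max(0.0, 1.0 - dist / self.max_dist)
            # Assumed freshness: exponential decay with message age.
            freshness = math.exp(-decay * (now - ts))
            # Summing over occurrences accounts for term frequency.
            for term in terms:
                scores[term] += proximity * freshness
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked[:k]


search = SlidingWindowTermSearch(window_size=3, max_dist=10.0)
search.add(0, 0, 0, ["coffee", "rain"])
search.add(1, 0, 1, ["coffee"])
search.add(9, 0, 2, ["traffic"])
first_result = search.top_k(0, 0, now=2, k=2)
```

Each query in this sketch scans the whole window, which is the cost that an index with a best-first traversal is meant to avoid.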

Author information

Corresponding author

Correspondence to Shuo Shang.

Additional information

This article belongs to the Topical Collection: Special Issue on Web Information Systems Engineering 2017

Guest Editors: Lu Chen and Yunjun Gao

About this article

Cite this article

Chen, L., Shang, S., Yao, B. et al. Spatio-temporal top-k term search over sliding window. World Wide Web 22, 1953–1970 (2019). https://doi.org/10.1007/s11280-018-0606-x

Keywords

  • Top-k
  • Term
  • Spatial
  • Temporal