Top-k Temporal Keyword Query over Social Media Data

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9931)

Abstract

Analytic jobs over social media data typically need to explore data from different periods. However, most existing keyword search work merely uses the creation time of items to measure their recency. In this paper, we propose the top-k temporal keyword query, which ranks items by the aggregate number of times they were shared during a given time window. We provide a query algorithm that can be executed over a general temporal inverted index. A complexity analysis based on the power-law distribution gives an upper bound on the number of accessed items. Furthermore, a two-tier structure and a piecewise maximum approximation sketch are proposed as refinements. Extensive empirical studies on a real-life dataset show that the combination of the two refinements achieves a remarkable performance improvement under different query settings.
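
To make the ranking criterion concrete, the following is a minimal Python sketch of a top-k temporal keyword query: each item that contains the keyword is scored by how many times it was shared inside the query window, and the k highest-scoring items are returned. This is an exhaustive baseline written for illustration only; it is not the paper's query algorithm and does not include the two-tier structure or the approximation sketch. The class `TemporalIndex` and all of its method names are hypothetical.

```python
from bisect import bisect_left, bisect_right, insort
from collections import defaultdict
import heapq


class TemporalIndex:
    """Toy index (hypothetical): keyword -> item ids, plus sorted share times per item."""

    def __init__(self):
        self.postings = defaultdict(set)      # keyword -> {item_id}
        self.share_times = defaultdict(list)  # item_id -> sorted share timestamps

    def add_item(self, item_id, keywords):
        """Register an item under each of its keywords."""
        for kw in keywords:
            self.postings[kw].add(item_id)

    def add_share(self, item_id, timestamp):
        """Record one share event, keeping the timestamp list sorted."""
        insort(self.share_times[item_id], timestamp)

    def window_count(self, item_id, start, end):
        """Number of shares with start <= t <= end, via two binary searches."""
        times = self.share_times.get(item_id, [])
        return bisect_right(times, end) - bisect_left(times, start)

    def top_k(self, keyword, start, end, k):
        """Score every item containing `keyword`; return the k best as (count, item_id)."""
        scored = (
            (self.window_count(item_id, start, end), item_id)
            for item_id in self.postings.get(keyword, ())
        )
        return heapq.nlargest(k, scored)


# Example data (invented for illustration).
idx = TemporalIndex()
idx.add_item("a", ["football"])
idx.add_item("b", ["football"])
idx.add_item("c", ["music"])
for t in (1, 2, 3, 50):
    idx.add_share("a", t)
for t in (10, 11, 12, 13, 14):
    idx.add_share("b", t)
idx.add_share("c", 10)

print(idx.top_k("football", 10, 20, 1))
print(idx.top_k("football", 0, 5, 1))
```

The same keyword yields different winners for different windows, which is the behaviour that a creation-time-only recency measure cannot capture. Because this sketch touches every posting for the keyword, its cost grows with the posting list length.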

Acknowledgements

This work is partially supported by the National High-tech R&D Program (863 Program) under grant number 2015AA015307 and the National Science Foundation of China under grant number 61432006.

Author information

Corresponding author

Correspondence to Weining Qian.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Xia, F., Yu, C., Qian, W., Zhou, A. (2016). Top-k Temporal Keyword Query over Social Media Data. In: Li, F., Shim, K., Zheng, K., Liu, G. (eds) Web Technologies and Applications. APWeb 2016. Lecture Notes in Computer Science, vol 9931. Springer, Cham. https://doi.org/10.1007/978-3-319-45814-4_15

  • DOI: https://doi.org/10.1007/978-3-319-45814-4_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-45813-7

  • Online ISBN: 978-3-319-45814-4

  • eBook Packages: Computer Science, Computer Science (R0)
