Tweet Timeline Generation via Graph-Based Dynamic Greedy Clustering

  • Conference paper
Information Retrieval Technology (AIRS 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9460)

Abstract

When searching for a query on a microblogging service, a user typically receives an archive of tweets as part of a retrospective piece on the impact of social media. To make the retrieved tweets easier to understand, it is useful to produce a summarized timeline for a given topic. However, tweet timeline generation is challenging because microblogs are noisy and time-sensitive. In this paper, we propose a graph-based dynamic greedy clustering approach that considers the coverage, relevance and novelty of the tweet timeline. First, tweet embedding representations are learned in order to construct a tweet semantic graph. Based on the graph, we estimate the coverage of the timeline according to the graph connectivity. Furthermore, we integrate a noise tweet elimination component that removes noisy tweets using lexical and semantic features based on relevance and novelty. Experimental results on public Text Retrieval Conference (TREC) Twitter corpora demonstrate the effectiveness of the proposed approach.
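The abstract gives only the outline of the method, so the following is a minimal illustrative sketch rather than the authors' algorithm. It assumes each tweet is already represented by an embedding vector, links two tweets when their cosine similarity reaches a hypothetical `threshold`, and greedily picks the tweet whose graph neighbourhood covers the most tweets not yet covered. The function names, the threshold value, and the tie-breaking rule are all assumptions made for illustration; the relevance and novelty based noise elimination step is omitted.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_graph(vectors, threshold):
    """Link tweets i and j when their embeddings are at least `threshold` similar."""
    graph = {i: set() for i in range(len(vectors))}
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                graph[i].add(j)
                graph[j].add(i)
    return graph


def greedy_timeline(vectors, threshold=0.8):
    """Greedy coverage: repeatedly pick the uncovered tweet whose neighbourhood
    (itself plus its graph neighbours) covers the most uncovered tweets.
    Ties go to the lowest index. The threshold is an assumed value."""
    graph = build_graph(vectors, threshold)
    uncovered = set(graph)
    timeline = []
    while uncovered:
        best = max(
            sorted(uncovered),
            key=lambda i: len((graph[i] | {i}) & uncovered),
        )
        timeline.append(best)
        uncovered -= graph[best] | {best}
    return timeline


if __name__ == "__main__":
    # Toy 2-D "embeddings": tweets 0, 1, 4 are near-duplicates, as are 2 and 3.
    toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [1.0, 0.05]]
    print(greedy_timeline(toy))
```

In this toy input the selected indices act as one representative per group of near-duplicate tweets. A fuller treatment would also score candidates for relevance to the query and novelty against tweets already in the timeline, as the abstract describes.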

Notes

  1. https://github.com/lintool/twitter-tools

Acknowledgments

The work reported in this paper is supported by the National Natural Science Foundation of China under Grant 61370116. We thank the anonymous reviewers for their helpful comments.

Author information

Corresponding author

Correspondence to Jianwu Yang.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Fan, F., Qiang, R., Lv, C., Zhao, W.X., Yang, J. (2015). Tweet Timeline Generation via Graph-Based Dynamic Greedy Clustering. In: Zuccon, G., Geva, S., Joho, H., Scholer, F., Sun, A., Zhang, P. (eds) Information Retrieval Technology. AIRS 2015. Lecture Notes in Computer Science, vol 9460. Springer, Cham. https://doi.org/10.1007/978-3-319-28940-3_24

  • DOI: https://doi.org/10.1007/978-3-319-28940-3_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-28939-7

  • Online ISBN: 978-3-319-28940-3

  • eBook Packages: Computer Science, Computer Science (R0)
