A reranking-based tweet retrieval approach for planned events

Abstract

Twitter provides access to the latest information. Whenever a major event happens, people search for event-related information on social media platforms such as Twitter, so it is essential to develop methods that retrieve high-quality event-related tweets. People share opinions, feelings, and feedback about events happening around the world on Twitter in the form of tweets. These tweets are often short and noisy, which makes it difficult to retrieve the most relevant data for a given event. We propose a two-phase approach to retrieve tweets related to planned events. In the first phase, initial retrieval is performed using the BM25 algorithm. In the second phase, the retrieved tweets are reranked by combining three scores: the BM25 score, a score based on the top hashtags related to the event, and a score based on the top TF-IDF terms related to the event. A learning-to-rank algorithm, SVM_Rank, is applied to learn weights for these three scores and combine them into the final score of each tweet. We performed experiments on two benchmark datasets, CLEF and TREC. Experimental results show that our method outperforms the baseline and previously published methods on both datasets according to multiple evaluation metrics.
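The two-phase pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`bm25_scores`, `overlap`, `rerank`), the whitespace tokenizer, the overlap-based hashtag and term scores, the max-normalization of BM25, and the hand-set `weights` are all hypothetical. In the paper the weights are learned with SVM_Rank; here they are simply passed in.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Return the Okapi BM25 score of every tokenized tweet for the query."""
    n = len(docs)
    avgdl = (sum(len(d) for d in docs) / n) or 1.0  # avoid division by zero
    df = Counter(term for d in docs for term in set(d))  # document frequency
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def overlap(tokens, vocabulary):
    """Fraction of a tweet's tokens found in a vocabulary (assumed scoring)."""
    return sum(t in vocabulary for t in tokens) / len(tokens) if tokens else 0.0


def rerank(query, tweets, top_hashtags, top_terms, weights, k=100):
    """Phase 1: keep the top-k tweets by BM25. Phase 2: rerank them by a
    weighted sum of BM25, hashtag, and TF-IDF-term scores."""
    if not tweets:
        return []
    docs = [t.lower().split() for t in tweets]  # naive tokenizer (assumption)
    bm25 = bm25_scores(query, docs)
    candidates = sorted(range(len(docs)), key=lambda i: bm25[i], reverse=True)[:k]
    max_bm25 = max(bm25[i] for i in candidates) or 1.0  # scale BM25 to [0, 1]
    w_bm25, w_hash, w_term = weights  # stand-ins for learned weights
    scored = []
    for i in candidates:
        final = (w_bm25 * bm25[i] / max_bm25
                 + w_hash * overlap(docs[i], top_hashtags)
                 + w_term * overlap(docs[i], top_terms))
        scored.append((i, final))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Calling `rerank(["worldcup", "final"], tweets, {"#worldcup"}, {"final"}, (0.5, 0.3, 0.2))` returns `(tweet_index, final_score)` pairs in descending score order. The hashtag and term sets are assumed to have been extracted for the event beforehand.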

Notes

  1. https://www.eventbrite.com/

  2. http://www.eventful.com

  3. http://www.last.fm/

  4. The authors use two additional services, Microsoft Web n-gram Service and Yahoo! Term Extraction Web Service, to extract important terms. We did not use these services because they are no longer available or have been deprecated.

Acknowledgements

This paper is an outcome of the research and development work done under the Visvesvaraya PhD Scheme of the Ministry of Electronics and Information Technology, Government of India.

Author information

Corresponding author

Correspondence to Sreekanth Madisetty.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Madisetty, S., Desarkar, M.S. A reranking-based tweet retrieval approach for planned events. World Wide Web (2021). https://doi.org/10.1007/s11280-021-00962-8

Keywords

  • Tweet retrieval
  • Twitter
  • Planned events
  • Microblog retrieval
  • Social media