Retrieving Information from Microblog Using Pattern Mining and Relevance Feedback

  • Conference paper
Data and Knowledge Engineering (ICDKE 2012)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7696)

Abstract

Retrieving information from Twitter is challenging because of its large volume, inconsistent writing, and noise. Most existing information retrieval (IR) and text mining methods take a term-based approach, but they suffer from term variation problems such as polysemy and synonymy. These problems worsen when such methods are applied to Twitter because of the length limit on tweets. It has long been hypothesised that pattern-based methods should outperform term-based methods because patterns provide more context, but few studies have tested this hypothesis, especially on Twitter. This paper presents an innovative framework for performing IR on microblogs. The proposed framework discovers patterns in tweets as higher-level features and uses them to assign weights to low-level features (i.e. terms) based on the terms' distributions in the higher-level features. We present experimental results on the TREC11 microblog dataset and show that the proposed approach significantly outperforms the term-based methods Okapi BM25 and TF-IDF, as well as pattern-based methods, in terms of precision, recall, and F measures.
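
The abstract does not specify the mining algorithm or the weighting formula, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that patterns are frequent termsets found by brute-force enumeration over a handful of relevance-feedback tweets, and that each pattern's support is spread evenly over its terms. All function names, parameters, and example data are hypothetical. A small helper for precision, recall, and F1 is included because the abstract evaluates with those measures.

```python
from collections import Counter
from itertools import combinations


def mine_frequent_patterns(docs, min_support=2, max_len=2):
    """Return {termset: support} for termsets found in >= min_support docs.

    Brute-force enumeration; an assumption for illustration only.
    """
    counts = Counter()
    for terms in docs:
        unique = sorted(set(terms))
        for size in range(1, max_len + 1):
            for combo in combinations(unique, size):
                counts[frozenset(combo)] += 1
    return {p: s for p, s in counts.items() if s >= min_support}


def term_weights(patterns):
    """Spread each pattern's support evenly over its terms and sum per term.

    This weighting rule is an assumption, not the formula from the paper.
    """
    weights = Counter()
    for pattern, support in patterns.items():
        for term in pattern:
            weights[term] += support / len(pattern)
    return dict(weights)


def score(tweet_terms, weights):
    """Score a tweet as the sum of the weights of its distinct terms."""
    return sum(weights.get(t, 0.0) for t in set(tweet_terms))


def precision_recall_f1(retrieved, relevant):
    """Compute set-based precision, recall, and F1."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical relevance-feedback tweets, already tokenised.
feedback = [
    ["earthquake", "japan", "tsunami"],
    ["earthquake", "japan"],
    ["tsunami", "warning"],
]
weights = term_weights(mine_frequent_patterns(feedback))
print(score(["japan", "earthquake", "news"], weights))
print(precision_recall_f1({"t1", "t2", "t3"}, {"t1", "t3", "t4", "t5"}))
```

In this toy data, terms that co-occur in a frequent pair ("earthquake", "japan") end up with a higher weight than a term that is frequent only on its own ("tsunami"), which is the kind of extra context the pattern-based hypothesis relies on.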

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lau, C.H., Tao, X., Tjondronegoro, D., Li, Y. (2012). Retrieving Information from Microblog Using Pattern Mining and Relevance Feedback. In: Xiang, Y., Pathan, M., Tao, X., Wang, H. (eds) Data and Knowledge Engineering. ICDKE 2012. Lecture Notes in Computer Science, vol 7696. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34679-8_15

  • DOI: https://doi.org/10.1007/978-3-642-34679-8_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34678-1

  • Online ISBN: 978-3-642-34679-8

  • eBook Packages: Computer Science, Computer Science (R0)
