Comments-Oriented Query Expansion for Opinion Retrieval in Blogs

  • Jose M. Chenlo
  • Javier Parapar
  • David E. Losada
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8109)


In recent years, Pseudo Relevance Feedback techniques have become one of the most effective query expansion approaches for document retrieval. In particular, Relevance-Based Language Models (RM) have been applied in several domains as an effective and efficient way to enhance topic retrieval. Recently, some extensions to the original RM methods have been proposed to apply query expansion in other scenarios, such as opinion retrieval. Such approaches rely on mixture models that combine the query expansion provided by Relevance Models with opinionated terms obtained from external resources (e.g., opinion lexicons). However, these methods ignore the structural aspects of a document, which are valuable for extracting topic-dependent opinion expressions. For instance, the sentiments conveyed in blogs are often located in specific parts of the blog posts and their comments. We argue here that the comments are a good guide to finding on-topic opinion terms that help to move the query towards burning aspects of the topic. We study the role of the different parts of a blog document in enhancing blog opinion retrieval through query expansion. The proposed method does not require external resources or additional knowledge, and our experiments show that it is a promising and simple way to rank blog posts more accurately in terms of their sentiment towards the query topic. Our approach compares well with other opinion finding methods, obtaining high precision performance without harming mean average precision.
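
The idea of drawing expansion terms from comments can be illustrated with a minimal sketch. It is one plausible reading of the abstract, not the authors' implementation: it uses a generic RM1-style estimate in which each pseudo-relevant post is weighted by its query likelihood and term probabilities are taken from that post's comments only. The function name `comments_rm1`, the `{"post": ..., "comments": ...}` document layout, the Dirichlet parameter `mu`, and the add-one background smoothing are all assumptions made for illustration.

```python
from collections import Counter


def comments_rm1(query, feedback_docs, num_terms=5, mu=10.0):
    """Rank expansion terms from the comments of pseudo-relevant blog posts.

    feedback_docs: list of {"post": [tokens], "comments": [tokens]}
    (a hypothetical layout for the top-ranked documents).
    """
    # Background counts over the feedback set, used to smooth P(q|d).
    background = Counter()
    for doc in feedback_docs:
        background.update(doc["post"])
        background.update(doc["comments"])
    total = sum(background.values())
    vocab = len(background)

    scores = Counter()
    for doc in feedback_docs:
        comments = Counter(doc["comments"])
        n_comments = sum(comments.values())
        if n_comments == 0:
            continue  # no comments, so this post contributes no terms
        full = Counter(doc["post"]) + comments
        full_len = sum(full.values())

        # Query likelihood P(Q|d), Dirichlet-smoothed over the whole document.
        likelihood = 1.0
        for q in query:
            p_bg = (background[q] + 1) / (total + vocab + 1)  # add-one background
            likelihood *= (full[q] + mu * p_bg) / (full_len + mu)

        # P(w|d) is estimated from the comments only (assumption).
        for term, count in comments.items():
            scores[term] += likelihood * count / n_comments

    norm = sum(scores.values())
    if norm == 0:
        return []
    return [(t, s / norm) for t, s in scores.most_common(num_terms)]


# Toy feedback set: one on-topic post with opinionated comments, one off-topic.
docs = [
    {"post": ["new", "phone", "released", "today"],
     "comments": ["phone", "battery", "terrible", "terrible", "love", "screen"]},
    {"post": ["weather", "report"],
     "comments": ["rain", "again", "boring"]},
]
print(comments_rm1(["phone"], docs, num_terms=3))
```

In this sketch the post body still influences which documents count as relevant, while only the comments supply candidate expansion terms, so no opinion lexicon is needed. In practice the selected terms would be interpolated with the original query, and stopwords and the query terms themselves would likely be filtered out.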


Information retrieval, opinion mining, blogs, comments, relevance models, pseudo relevance feedback, query expansion





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Jose M. Chenlo (1)
  • Javier Parapar (2)
  • David E. Losada (1)
  1. Centro de Investigación en Tecnoloxías da Información (CITIUS), Universidad de Santiago de Compostela, Spain
  2. IRLab, Computer Science Department, University of A Coruña, Spain
