Proximity-Based Reference Resolution to Improve Text Retrieval
Queries that contain named entities are very common, especially in blog retrieval. Current approaches to document retrieval are based on the frequency of query terms in documents. These methods may underestimate the frequency of a query entity because named entities are often referred to by anaphoric expressions rather than by name. In this paper we focus on pronouns as anaphoric expressions and propose a method for identifying the type of a query entity (female, male, or non-person), which determines the set of pronouns that can refer to the query. We also propose a proximity-based method for estimating the frequency of the anaphoric expressions that refer to a query entity in a document. Experimental results on a standard blog collection show that the proposed method is effective and provides a significant improvement over the term-frequency-based baseline.
Keywords: Term Frequency, Query Term, Entity Frequency, Coreference Resolution, Opinion Retrieval
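To make the idea of a proximity-based entity frequency concrete, the following is a minimal sketch and not the paper's actual model. It assumes a single-token query, a Gaussian kernel over token distance, a pronoun counted only against the nearest preceding mention of the query, and small hand-written pronoun lists. The names `PRONOUNS`, `proximity_entity_frequency`, and `sigma` are hypothetical and chosen for illustration only.

```python
import math

# Hypothetical pronoun sets per entity type (an assumption for this sketch;
# the abstract does not list the pronouns actually used).
PRONOUNS = {
    "female": {"she", "her", "hers", "herself"},
    "male": {"he", "him", "his", "himself"},
    "non-person": {"it", "its", "itself"},
}


def proximity_entity_frequency(tokens, query, entity_type, sigma=10.0):
    """Literal term frequency of `query` plus proximity-weighted pronoun counts.

    Each pronoun compatible with `entity_type` adds a weight in (0, 1] that
    decays with its token distance to the nearest preceding query mention.
    The Gaussian kernel and the default `sigma` are assumptions.
    """
    tokens = [t.lower() for t in tokens]
    query = query.lower()
    mentions = [i for i, t in enumerate(tokens) if t == query]
    pronouns = PRONOUNS[entity_type]

    score = float(len(mentions))  # plain term frequency
    for i, token in enumerate(tokens):
        if token not in pronouns:
            continue
        # Only mentions before the pronoun are treated as candidate antecedents.
        preceding = [m for m in mentions if m < i]
        if not preceding:
            continue
        distance = i - preceding[-1]
        score += math.exp(-(distance ** 2) / (2 * sigma ** 2))
    return score


doc = "Obama spoke today . He said he would return".split()
print(proximity_entity_frequency(doc, "Obama", "male"))
```

With the male pronoun set, the two pronouns raise the score above the literal term frequency of 1. With an incompatible type such as `"female"`, the score stays at 1, which is why identifying the query-entity type first matters.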