IRIT at INEX: Question Answering Task

  • Liana Ermakova
  • Josiane Mothe
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7424)


In this paper we describe an approach to tweet contextualization developed for the INEX question answering track. The task is to provide a context of up to 500 words for a tweet; this summary must be an extract from Wikipedia. Our approach relies on an index that includes not only lemmas but also named entities (NE). Sentence retrieval is based on the standard TF-IDF measure, enriched by named entity recognition, part-of-speech (POS) weighting, and smoothing from local context. The method was ranked first in the INEX QA track according to the content evaluation.
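The scoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the input format (sentences already lemmatized, POS-tagged and annotated with named entities), the POS weights, the named entity bonus, the neighbor weight, and all function names are assumptions made for illustration only.

```python
import math
from collections import Counter

# Hypothetical weights keyed by Penn Treebank tag prefix; not taken from the paper.
POS_WEIGHTS = {"NN": 1.0, "VB": 0.6, "JJ": 0.5}
DEFAULT_POS_WEIGHT = 0.2
NE_BONUS = 0.5          # assumed additive bonus per shared named entity
NEIGHBOR_WEIGHT = 0.25  # assumed share of the score taken from each adjacent sentence


def pos_weight(tag):
    """Look up a weight using the first two characters of the POS tag."""
    return POS_WEIGHTS.get(tag[:2], DEFAULT_POS_WEIGHT)


def idf_table(sentences):
    """Smoothed inverse document frequency, treating each sentence as a document."""
    n = len(sentences)
    df = Counter()
    for sent in sentences:
        df.update({lemma for lemma, _ in sent["tokens"]})
    return {lemma: math.log((1 + n) / (1 + d)) + 1.0 for lemma, d in df.items()}


def raw_score(query, sent, idf):
    """POS-weighted TF-IDF overlap plus a bonus for shared named entities."""
    tf = Counter(lemma for lemma, _ in sent["tokens"])
    score = 0.0
    for lemma, tag in query["tokens"]:
        score += pos_weight(tag) * tf[lemma] * idf.get(lemma, 0.0)
    score += NE_BONUS * len(query["entities"] & sent["entities"])
    return score


def smoothed_scores(query, sentences):
    """Mix each sentence's raw score with those of its immediate neighbors."""
    idf = idf_table(sentences)
    raw = [raw_score(query, s, idf) for s in sentences]
    smoothed = []
    for i, r in enumerate(raw):
        before = raw[i - 1] if i > 0 else 0.0
        after = raw[i + 1] if i + 1 < len(raw) else 0.0
        smoothed.append((1 - 2 * NEIGHBOR_WEIGHT) * r + NEIGHBOR_WEIGHT * (before + after))
    return smoothed


# Toy data: (lemma, POS tag) pairs and named entity sets, all invented.
query = {"tokens": [("obama", "NNP"), ("visit", "VB")], "entities": {"Obama"}}
sentences = [
    {"tokens": [("obama", "NNP"), ("visit", "VB"), ("paris", "NNP")],
     "entities": {"Obama", "Paris"}},
    {"tokens": [("he", "PRP"), ("meet", "VB"), ("official", "NNS")],
     "entities": set()},
    {"tokens": [("weather", "NN"), ("be", "VB"), ("mild", "JJ")],
     "entities": set()},
]
scores = smoothed_scores(query, sentences)
```

In this toy example the second sentence shares no term with the query, yet it receives a non-zero score because it is adjacent to a matching sentence. That is the intended effect of smoothing from local context; the third sentence, with no matching neighbor, stays at zero.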


Information retrieval, summarization, extraction, contextual information, smoothing, part-of-speech tagging, named entity





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Liana Ermakova (1)
  • Josiane Mothe (1)
  1. Institut de Recherche en Informatique de Toulouse, Toulouse Cedex 9, France
