A Hybrid QA System with Focused IR and Automatic Summarization for INEX 2011

  • Pinaki Bhaskar
  • Somnath Banerjee
  • Snehasis Neogi
  • Sivaji Bandyopadhyay
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7424)

Abstract

This article presents the experiments carried out as part of our participation in the QA track of INEX 2011, for which we submitted two runs. The INEX QA task has two main subtasks: Focused IR and Automatic Summarization. In the Focused IR system, we first preprocess the Wikipedia documents and then index them using Nutch. Stop words are removed from each query tweet, and the remaining tweet words are stemmed using the Porter stemmer. The stemmed tweet words form the query used to retrieve the most relevant document from the index. The Automatic Summarization system takes as input the query tweet along with the tweet’s text and the title of the most relevant document. The most relevant sentences are retrieved from the associated document based on the TF-IDF of the words that match the query tweet, the tweet’s text, and the title. Each retrieved sentence is assigned a ranking score in the Automatic Summarization system. The answer passage consists of the top-ranked retrieved sentences, up to a limit of 500 words. The two runs differ in how the relevant sentences are retrieved from the associated document. Our first run achieved the highest score among all participants, 432.2, on the Relaxed metric of the Readability evaluation.
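
The query preparation and sentence ranking steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The stop list, the crude_stem function (a simple suffix stripper standing in for the Porter stemmer), the smoothed IDF formula, and the choice to treat each sentence as a document are all assumptions made for illustration. The Nutch indexing and document retrieval step is omitted.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop list (assumption; the paper's list is not given).
STOP_WORDS = {"a", "an", "and", "are", "as", "at", "by", "for", "from",
              "in", "is", "it", "of", "on", "the", "to", "with"}


def crude_stem(word):
    """Stand-in for the Porter stemmer: strips a few common suffixes only."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def terms(text):
    """Lowercase, tokenize, drop stop words, and stem what remains."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


def rank_sentences(query, sentences):
    """Score each sentence by the summed TF-IDF of the query terms in it."""
    query_terms = set(terms(query))
    sent_terms = [Counter(terms(s)) for s in sentences]
    n = len(sentences)
    # Document frequency, treating each sentence as one "document".
    df = Counter(t for counts in sent_terms for t in counts)
    scored = []
    for idx, counts in enumerate(sent_terms):
        score = sum(
            counts[t] * (math.log((1 + n) / (1 + df[t])) + 1.0)
            for t in query_terms if t in counts
        )
        scored.append((score, idx))
    # Highest score first; ties keep the original sentence order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored


def build_passage(query, sentences, word_limit=500):
    """Join top-ranked sentences, skipping any that match nothing or do not fit."""
    chosen, used = [], 0
    for score, idx in rank_sentences(query, sentences):
        length = len(sentences[idx].split())
        if score <= 0 or used + length > word_limit:
            continue
        chosen.append(idx)
        used += length
    # Emit the selected sentences in document order (an assumption).
    return " ".join(sentences[i] for i in sorted(chosen))
```

Calling build_passage with a tweet as the query and the sentences of the retrieved document returns a passage of at most word_limit words. In a fuller system, the query would also include the tweet's text and the document title, as the abstract describes.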

Keywords

Information Retrieval, Automatic Summarization, Question Answering, Information Extraction, INEX 2011

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Pinaki Bhaskar (1)
  • Somnath Banerjee (1)
  • Snehasis Neogi (1)
  • Sivaji Bandyopadhyay (1)
  1. Department of Computer Science and Engineering, Jadavpur University, Kolkata, India