20th Century Esfinge (Sphinx) Solving the Riddles at CLEF 2005
Esfinge is a general-domain Portuguese question answering system. It tries to take advantage of the steadily growing and constantly updated information freely available on the World Wide Web in its question answering tasks. The system participated for the first time last year, in the monolingual QA track. However, the results were compromised by several basic errors, which were corrected shortly afterwards. This year, Esfinge's participation was expected to yield better results, to allow experimentation with a Named Entity Recognition System, and to try a multilingual QA track for the first time. This paper describes how the system works and presents the results of the official runs in considerable detail. It also presents experiments that measure the contribution of different parts of the system, reported as the decrease in performance when the system is run without some of its components or features.
Keywords: Machine Translation, Document Collection, Search Pattern, Named Entity Recognition, Entity Recognition