20th Century Esfinge (Sphinx) Solving the Riddles at CLEF 2005

  • Luís Costa
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4022)


Esfinge is a general-domain Portuguese question answering system. It tries to take advantage of the steadily growing and constantly updated information freely available on the World Wide Web in its question answering tasks. The system participated in the monolingual QA track for the first time last year. However, the results were compromised by several basic errors, which were corrected shortly afterwards. This year, Esfinge's participation was expected to yield better results, to allow experimentation with a named entity recognition system, and to try a multilingual QA track for the first time. This paper describes how the system works and presents the results of the official runs in considerable detail. It also reports experiments that measure the importance of different parts of the system by recording the decrease in performance when the system is run without some of its components or features.


Keywords (added by machine, not by the authors): Machine Translation, Document Collection, Search Pattern, Name Entity Recognition, Entity Recognition





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Luís Costa (1)
  1. Linguateca at SINTEF ICT, Oslo, Norway
