
Improving IdSay: A Characterization of Strengths and Weaknesses in Question Answering Systems for Portuguese

  • Conference paper
Computational Processing of the Portuguese Language (PROPOR 2010)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6001)


Abstract

IdSay is a Question Answering system for Portuguese that participated in QA@CLEF 2008 with a baseline version (IdSayBL). Although the results were encouraging, there was still much room for improvement. Six systems participated in the Portuguese task, with very good results either individually or in a hypothetical combination run, which provided a valuable source of information. We analysed all the answers submitted by all systems to identify their strengths and weaknesses, and used the conclusions of that analysis to guide our improvements, keeping in mind the two key characteristics we want for the system: efficiency in terms of response time and robustness in handling different types of data. The result is an improved version of IdSay whose most important enhancement is the introduction of semantic information. First-answer accuracy improved significantly, from 32.5% in IdSayBL to 50.5% in IdSay, without degradation of response time.
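To make the two measures in the abstract concrete, here is a minimal sketch of first-answer accuracy and of a combination run. It assumes a combination run counts a question as correct when at least one participating system answered it correctly; the abstract does not define the term, so that reading is an assumption. The function names and the toy judgements are hypothetical and are not CLEF data. Only the 32.5% and 50.5% figures come from the abstract.

```python
def first_answer_accuracy(judgements):
    """Fraction of questions whose first answer was judged correct."""
    return sum(judgements) / len(judgements)


def combination_run(runs):
    """Assumed oracle combination: a question is correct if any system got it right."""
    return [any(per_question) for per_question in zip(*runs)]


# Toy per-question judgements for two hypothetical systems (not CLEF data).
system_a = [True, False, False, True]
system_b = [False, False, True, True]

acc_a = first_answer_accuracy(system_a)
combined = combination_run([system_a, system_b])
acc_combined = first_answer_accuracy(combined)

# Accuracy figures reported in the abstract, in percent.
baseline, improved = 32.5, 50.5
absolute_gain = improved - baseline        # percentage points
relative_gain = absolute_gain / baseline   # fraction of the baseline accuracy

print(f"toy system A: {acc_a:.2f}, toy combination: {acc_combined:.2f}")
print(f"IdSayBL -> IdSay: +{absolute_gain:.1f} points ({relative_gain:.1%} relative)")
```

On the reported figures this gives a gain of 18.0 percentage points, or roughly 55% relative to the baseline.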




Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Carvalho, G., de Matos, D.M., Rocio, V. (2010). Improving IdSay: A Characterization of Strengths and Weaknesses in Question Answering Systems for Portuguese. In: Pardo, T.A.S., Branco, A., Klautau, A., Vieira, R., de Lima, V.L.S. (eds) Computational Processing of the Portuguese Language. PROPOR 2010. Lecture Notes in Computer Science, vol 6001. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12320-7_1


  • DOI: https://doi.org/10.1007/978-3-642-12320-7_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12319-1

  • Online ISBN: 978-3-642-12320-7

  • eBook Packages: Computer Science, Computer Science (R0)
