Analysing the Role of Representation Choices in Portuguese Relation Extraction

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9283)

Abstract

Relation Extraction is the task of identifying and classifying the semantic relations between entities in text. This task is one of the main challenges in Natural Language Processing. In this work, the relation extraction task is treated as a sequence labelling problem. We analysed the impact of different representation schemes for the relation descriptors. In particular, we analysed the performance of the BIO and IO schemes with a Conditional Random Fields classifier for the extraction of any relation descriptor occurring between named entities in the Organisation domain (Person, Organisation, Place). Overall, the classifier proposed here achieves its best results with the IO notation.
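
The difference between the two representation schemes can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `encode_descriptor`, the label `REL`, and the example sentence are assumptions made for illustration only. In BIO, the first token of a relation descriptor is tagged `B-`, the following tokens `I-`, and everything else `O`; IO drops the `B-` tag, so every descriptor token is tagged `I-`.

```python
from typing import List, Tuple


def encode_descriptor(
    n_tokens: int,
    spans: List[Tuple[int, int]],
    scheme: str = "BIO",
    label: str = "REL",  # hypothetical label name, not taken from the paper
) -> List[str]:
    """Tag a sentence of n_tokens tokens given descriptor spans [start, end)."""
    if scheme not in ("BIO", "IO"):
        raise ValueError(f"unknown scheme: {scheme!r}")
    tags = ["O"] * n_tokens
    for start, end in spans:
        for i in range(start, end):
            # Only BIO marks the first token of a span with a distinct B- tag.
            if scheme == "BIO" and i == start:
                tags[i] = f"B-{label}"
            else:
                tags[i] = f"I-{label}"
    return tags


if __name__ == "__main__":
    # Invented example sentence; the descriptor "diretor da" covers tokens 3 and 4.
    tokens = ["Ronaldo", "Lemos", ",", "diretor", "da", "Creative", "Commons"]
    for scheme in ("BIO", "IO"):
        tags = encode_descriptor(len(tokens), [(3, 5)], scheme)
        print(scheme, list(zip(tokens, tags)))
```

IO uses fewer labels, which gives a sequence classifier fewer classes to learn, but it cannot separate two descriptors that are directly adjacent, whereas BIO can. Such a tag sequence would then serve as the target labels for a sequence classifier such as a CRF.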

Author information

Corresponding author

Correspondence to Sandra Collovini.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Collovini, S., de Bairros P. Filho, M., Vieira, R. (2015). Analysing the Role of Representation Choices in Portuguese Relation Extraction. In: Mothe, J., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2015. Lecture Notes in Computer Science, vol 9283. Springer, Cham. https://doi.org/10.1007/978-3-319-24027-5_9

  • DOI: https://doi.org/10.1007/978-3-319-24027-5_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24026-8

  • Online ISBN: 978-3-319-24027-5

  • eBook Packages: Computer Science, Computer Science (R0)
