Analysing the Role of Representation Choices in Portuguese Relation Extraction

Collovini, Sandra; de Bairros P. Filho, Marcelo; Vieira, Renata

doi:10.1007/978-3-319-24027-5_9

Sandra Collovini,
Marcelo de Bairros P. Filho &
Renata Vieira

Conference paper
First Online: 20 November 2015

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9283)

Abstract

Relation Extraction is the task of identifying and classifying the semantic relations between entities in text, and it is one of the main challenges in Natural Language Processing. In this work, the relation extraction task is treated as a sequence labelling problem. We analysed the impact of different representation schemes for the relation descriptors. In particular, we analysed the performance of the BIO and IO schemes with a Conditional Random Fields classifier for the extraction of any relation descriptor occurring between named entities in the Organisation domain (Person, Organisation, Place). Overall, the classifier proposed here achieves its best results with the IO notation.
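To make the two representation schemes concrete, the following minimal sketch encodes a relation descriptor span as BIO and as IO tags and decodes the tags back into spans. It is an illustration only, not the authors' implementation: the sentence, the label name REL, and the helper functions encode, bio_to_io and decode are all hypothetical. In BIO, the first token of a descriptor is tagged B- and the remaining tokens I-; in IO, every descriptor token is tagged I-, so the tag set is smaller but two directly adjacent descriptors can no longer be told apart.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def encode(n_tokens: int, spans: List[Span], scheme: str = "BIO",
           label: str = "REL") -> List[str]:
    """Tag each token as part of a descriptor span or as outside (O)."""
    tags = ["O"] * n_tokens
    for start, end in spans:
        for i in range(start, end):
            # Only BIO marks the first token of a span with a B- prefix.
            if scheme == "BIO" and i == start:
                tags[i] = f"B-{label}"
            else:
                tags[i] = f"I-{label}"
    return tags


def bio_to_io(tags: List[str]) -> List[str]:
    """Collapse BIO tags into IO tags by rewriting every B- prefix as I-."""
    return ["I-" + t[2:] if t.startswith("B-") else t for t in tags]


def decode(tags: List[str]) -> List[Span]:
    """Recover spans from a BIO or IO tag sequence."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            if start is not None:      # a B- tag closes the previous span
                spans.append((start, i))
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:              # span running to the end of the sentence
        spans.append((start, len(tags)))
    return spans


# Hypothetical sentence: the descriptor "diretora da" links Ana and Acme.
tokens = ["Ana", "é", "diretora", "da", "Acme"]
bio = encode(len(tokens), [(2, 4)], scheme="BIO")
io = encode(len(tokens), [(2, 4)], scheme="IO")
print(list(zip(tokens, bio)))
print(list(zip(tokens, io)))
```

For a single descriptor both schemes decode to the same span. The difference shows only when two descriptors touch: BIO keeps them apart, while IO merges them into one span. Whether that loss matters depends on how often adjacent descriptors occur in the annotated data.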

Author information

Authors and Affiliations

Faculdade de Informática, Pontifícia Universidade Católica do Rio Grande do Sul - PUCRS, Av. Ipiranga, 6681, Porto Alegre, RS, Brazil
Sandra Collovini, Marcelo de Bairros P. Filho & Renata Vieira

Corresponding author

Correspondence to Sandra Collovini.

Editor information

Editors and Affiliations

Institut de Recherche en Informatique de Toulouse, Toulouse, France
Josiane Mothe
Department of Computer Science, University of Neuchatel, Neuchâtel, Switzerland
Jacques Savoy
Faculteit der Geesteswetenschappen, Universiteit Amsterdam, Amsterdam, The Netherlands
Jaap Kamps
Institut de Recherche en Informatique de Toulouse, Toulouse, France
Karen Pinel-Sauvagnat
School of Computing, Dublin City University, Dublin, Ireland
Gareth Jones
LIA - CERI, Université d'Avignon et des Pays de Vaucluse, Avignon, France
Eric San Juan
Department of Information Engineering, University of Padua, Padua, Italy
Linda Cappellato
Department of Information Engineering (DEI), University of Padua, Padova, Italy
Nicola Ferro

About this paper

Cite this paper

Collovini, S., de Bairros P. Filho, M., Vieira, R. (2015). Analysing the Role of Representation Choices in Portuguese Relation Extraction. In: Mothe, J., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2015. Lecture Notes in Computer Science, vol 9283. Springer, Cham. https://doi.org/10.1007/978-3-319-24027-5_9

DOI: https://doi.org/10.1007/978-3-319-24027-5_9
Published: 20 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24026-8
Online ISBN: 978-3-319-24027-5
eBook Packages: Computer Science, Computer Science (R0)
