Effective Sequence Labeling with Hybrid Neural-CRF Models

da Costa, Pablo; Paetzold, Gustavo H.

doi:10.1007/978-3-319-99722-3_49

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11122)

Included in the following conference series:

International Conference on Computational Processing of the Portuguese Language

Abstract

Sequence tagging models can take many forms, each featuring strong points and limitations. In this contribution, we introduce a hybrid model for sequence tagging that combines recurrent neural networks with conditional random fields. It avoids feature engineering and addresses rare and out-of-vocabulary words by complementing typical word embeddings with compositional character-to-word representations. Using shared parameters across multiple tasks, we are able to achieve performance scores that are either superior or comparable to current state-of-the-art models.
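
As a rough illustration of the idea described above, the sketch below concatenates a word-level vector with a vector composed from the word's characters, so that an out-of-vocabulary word still receives an informative representation. It is not the authors' implementation: the toy vocabulary, the dimensions, the hash-derived vectors, and the names `represent` and `char_to_word` are all hypothetical, and mean pooling stands in for a learned character-to-word composer.

```python
import hashlib

WORD_DIM = 4  # assumed size of the word-level embedding
CHAR_DIM = 3  # assumed size of the character-level embedding


def _pseudo_vector(key, dim):
    """Deterministic toy vector derived from a hash; stands in for learned parameters."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return [digest[i] / 256.0 for i in range(dim)]


# Hypothetical tiny vocabulary playing the role of pre-trained word embeddings.
WORD_VECTORS = {w: _pseudo_vector("word:" + w, WORD_DIM) for w in ("o", "gato", "dorme")}
UNK_VECTOR = [0.0] * WORD_DIM


def char_to_word(word):
    """Compose a vector from the word's characters (mean pooling, an assumption)."""
    vectors = [_pseudo_vector("char:" + c, CHAR_DIM) for c in word]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def represent(word):
    """Concatenate the word-level vector (or UNK) with the character-composed vector."""
    return WORD_VECTORS.get(word, UNK_VECTOR) + char_to_word(word)


if __name__ == "__main__":
    for token in ("gato", "gatinho"):  # "gatinho" is out of vocabulary here
        print(token, represent(token))
```

In this sketch the out-of-vocabulary token gets an all-zero word-level part, but its character-composed part is still specific to its spelling, which is the property the hybrid representation relies on.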

Notes

1. The log probability learned from the CRF layer is backpropagated via cross-entropy.
2. For pre-trained word embeddings, we used those available at https://github.com/facebookresearch/fastText/blob/master/pretrained-vectors.md.
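
To make Note 1 more concrete, here is a minimal sketch of how a linear-chain CRF assigns a log probability to a tag sequence; its negative would serve as the loss that gets backpropagated. The sketch assumes per-token emission scores (for example, produced by a recurrent network) and a tag-to-tag transition matrix, omits start and stop transitions, and uses hypothetical function names. It is not taken from the paper.

```python
import math
from itertools import product


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def path_score(emissions, transitions, tags):
    """Unnormalised score of one tag sequence: emissions plus transitions."""
    score = emissions[0][tags[0]]
    for t in range(1, len(tags)):
        score += transitions[tags[t - 1]][tags[t]] + emissions[t][tags[t]]
    return score


def log_partition(emissions, transitions):
    """Forward algorithm: log of the summed exponentiated scores of all tag sequences."""
    n_tags = len(emissions[0])
    alpha = list(emissions[0])
    for t in range(1, len(emissions)):
        alpha = [
            logsumexp([alpha[i] + transitions[i][j] for i in range(n_tags)])
            + emissions[t][j]
            for j in range(n_tags)
        ]
    return logsumexp(alpha)


def sequence_log_prob(emissions, transitions, tags):
    """log p(tags | input) under the linear-chain CRF."""
    return path_score(emissions, transitions, tags) - log_partition(emissions, transitions)


if __name__ == "__main__":
    # Toy scores: 3 tokens, 2 tags; transitions[i][j] scores moving from tag i to tag j.
    emissions = [[1.0, 0.5], [0.2, 1.5], [0.3, 0.1]]
    transitions = [[0.5, -0.2], [0.1, 0.3]]
    gold = [0, 1, 0]
    print("negative log-likelihood:", -sequence_log_prob(emissions, transitions, gold))
    total = sum(
        math.exp(sequence_log_prob(emissions, transitions, list(path)))
        for path in product(range(2), repeat=3)
    )
    print("probabilities sum to:", total)
```

Summing the probabilities of every possible tag sequence is a convenient sanity check: the forward algorithm is correct only if they add up to one.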

Author information

Authors and Affiliations

Accenture Brazil, São Paulo, Brazil
Pablo da Costa
University of Sheffield, Sheffield, UK
Gustavo H. Paetzold

Corresponding authors

Correspondence to Pablo da Costa or Gustavo H. Paetzold.

Editor information

Editors and Affiliations

Institute of Informatics, Federal University of Rio Grande do Sul, Porto Alegre, Brazil
Aline Villavicencio
Instituto de Informática - UFRGS, Porto Alegre, Brazil
Viviane Moreira
INESC-ID, Lisbon, Portugal
Alberto Abad
UFSCAR, Sao Carlos, Brazil
Helena Caseli
Centro Singular de Investigación en Tecnoloxías, Universidade de Santiago de Compostela, Santiago de Compostela, La Coruña, Spain
Pablo Gamallo
Université de Toulon, Parc Scientifique Technologique Luminy, Marseille, France
Carlos Ramisch
Centro de Informática e Sistemas, Universidade de Coimbra, Coimbra, Portugal
Hugo Gonçalo Oliveira
Federal University of Technology, Dois Vizinhos, Paraná, Brazil
Gustavo Henrique Paetzold

About this paper

Cite this paper

da Costa, P., Paetzold, G.H. (2018). Effective Sequence Labeling with Hybrid Neural-CRF Models. In: Villavicencio, A., et al. Computational Processing of the Portuguese Language. PROPOR 2018. Lecture Notes in Computer Science, vol 11122. Springer, Cham. https://doi.org/10.1007/978-3-319-99722-3_49

DOI: https://doi.org/10.1007/978-3-319-99722-3_49
Published: 26 August 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99721-6
Online ISBN: 978-3-319-99722-3
eBook Packages: Computer Science, Computer Science (R0)
