Comparing Two Markov Methods for Part-of-Speech Tagging of Portuguese

  • Fábio N. Kepler
  • Marcelo Finger
Conference paper

DOI: 10.1007/11874850_52

Volume 4140 of the book series Lecture Notes in Computer Science (LNCS)
Cite this paper as:
Kepler F.N., Finger M. (2006) Comparing Two Markov Methods for Part-of-Speech Tagging of Portuguese. In: Sichman J.S., Coelho H., Rezende S.O. (eds) Advances in Artificial Intelligence - IBERAMIA-SBIA 2006. Lecture Notes in Computer Science, vol 4140. Springer, Berlin, Heidelberg

Abstract

A wide variety of statistical methods are applied to Part-of-Speech (PoS) tagging, the task of associating words in a text with their corresponding PoS. Most of these methods analyse a small, fixed neighborhood of words, imposing some form of Markov restriction. In this work we implement and compare a fixed-length hidden Markov model (HMM) with a variable-length Markov chain (VLMC); the latter is, in principle, capable of detecting long-distance dependencies. We show that the VLMC model performs better in terms of accuracy and almost equally well in terms of tagging time, and that it also does very well in training time. However, the VLMC method fails in practice to capture truly long-distance dependencies, and we analyse the reasons for this behaviour.
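The contrast between the two model families can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the tag set, and the probability table are hypothetical, chosen only to show how a fixed-length model always conditions on the same number of previous tags, whereas a variable-length model conditions on the longest stored suffix of the tag history.

```python
from typing import Dict, Sequence, Tuple

Context = Tuple[str, ...]


def fixed_context(history: Sequence[str], order: int) -> Context:
    """Fixed-length model: always condition on the last `order` tags."""
    return tuple(history[-order:]) if order > 0 else ()


def variable_context(
    history: Sequence[str], table: Dict[Context, Dict[str, float]]
) -> Context:
    """Variable-length model: use the longest suffix of the history in `table`."""
    for start in range(len(history) + 1):
        suffix = tuple(history[start:])
        if suffix in table:
            return suffix
    raise KeyError("table must contain the empty context ()")


# Toy, made-up next-tag distributions (illustrative only, not from the paper).
TABLE: Dict[Context, Dict[str, float]] = {
    (): {"N": 0.5, "V": 0.3, "DET": 0.2},
    ("DET",): {"N": 0.9, "V": 0.05, "DET": 0.05},
    ("V", "DET"): {"N": 0.95, "V": 0.03, "DET": 0.02},
}

if __name__ == "__main__":
    print(fixed_context(["N", "V", "DET"], 2))
    print(variable_context(["N", "V", "DET"], TABLE))
    print(variable_context(["N", "DET"], TABLE))
```

In this sketch the stored contexts have different lengths, so the amount of history consulted depends on what precedes the current word. A real VLMC learns which contexts to keep from training data, a step this example does not attempt.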

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Fábio N. Kepler (1)
  • Marcelo Finger (1)
  1. Institute of Mathematics and Statistics, University of São Paulo (USP)