Encyclopedia of Machine Learning

2010 Edition
Editors: Claude Sammut, Geoffrey I. Webb

POS Tagging

  • Walter Daelemans
Reference work entry
DOI: https://doi.org/10.1007/978-0-387-30164-8_643



Part-of-speech tagging (POS tagging) is the process of assigning each word in a text its appropriate morphosyntactic category (for example noun-singular, verb-past, adjective, pronoun-personal, and the like). It therefore provides information about both morphology (the structure of words) and syntax (the structure of sentences). This disambiguation process is governed both by constraints from the lexicon (what are the possible categories for a word?) and by constraints from the context in which the word occurs (which of the possible categories is the right one in this context?). For example, a word like "table" can be a noun-singular, but also a verb-present (as in "I table this motion"). This is lexical knowledge. The context of the word is then used to decide which of the possible categories is the correct one. In a sentence like "Put it on the table", the fact that "table"...
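The interplay of lexical and contextual constraints described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tiny lexicon, the `PREFERRED_AFTER` table of context preferences, and the function name `tag` are hypothetical and do not come from the entry or from any particular tagger. Real taggers learn such preferences from annotated corpora instead of hard-coding them.

```python
# Toy lexicon (hypothetical entries): the categories each word can take.
LEXICON = {
    "put": ["verb-present", "verb-past"],
    "it": ["pronoun-personal"],
    "on": ["preposition"],
    "the": ["determiner"],
    "table": ["noun-singular", "verb-present"],
    "i": ["pronoun-personal"],
    "this": ["determiner"],
    "motion": ["noun-singular"],
}

# Hypothetical context constraint: the category preferred for a word,
# given the category chosen for the word immediately before it.
PREFERRED_AFTER = {
    "determiner": "noun-singular",
    "pronoun-personal": "verb-present",
}


def tag(sentence):
    """Tag a whitespace-tokenized sentence from left to right.

    Step 1 (lexicon): look up the possible categories of each word.
    Step 2 (context): if the previous category prefers one of those
    candidates, choose it; otherwise fall back to the first candidate.
    """
    tagged = []
    previous = None
    for word in sentence.lower().split():
        candidates = LEXICON.get(word, ["unknown"])
        preferred = PREFERRED_AFTER.get(previous)
        choice = preferred if preferred in candidates else candidates[0]
        tagged.append((word, choice))
        previous = choice
    return tagged


if __name__ == "__main__":
    for sentence in ("Put it on the table", "I table this motion"):
        print(tag(sentence))
```

Under these assumed rules, "table" is tagged noun-singular when it follows a determiner and verb-present when it follows a personal pronoun, which mirrors the two readings discussed in the text.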


Recommended Reading

  1. Brants, T. (2000). TnT – A statistical part-of-speech tagger. In Proceedings of the sixth applied natural language processing conference ANLP-2000. Seattle, WA.
  2. Brill, E. (1995a). Transformation-based error-driven learning and natural language processing: A case study in part-of-speech tagging. Computational Linguistics, 21(4), 543–565.
  3. Brill, E. (1995b). Unsupervised learning of disambiguation rules for part of speech tagging. In Proceedings of the third workshop on very large corpora (pp. 1–13). Ohio State University, Ohio.
  4. Cussens, J. (1997). Part-of-speech tagging using Progol. In N. Lavrac & S. Dzeroski (Eds.), Proceedings of the seventh international workshop on inductive logic programming, Lecture Notes in Computer Science (Vol. 1297, pp. 93–108). London: Springer.
  5. Daelemans, W., Zavrel, J., Berck, P., & Gillis, S. (1996). MBT: A memory-based part of speech tagger generator. In Proceedings of the fourth workshop on very large corpora (pp. 14–27). Copenhagen, Denmark.
  6. Garside, R., & Smith, N. (1997). A hybrid grammatical tagger: CLAWS4. In R. Garside, G. Leech, & A. McEnery (Eds.), Corpus annotation: Linguistic information from computer text corpora (pp. 102–121). London: Longman.
  7. Jurafsky, D., & Martin, J. (2008). Speech and language processing: An introduction to natural language processing, computational linguistics, and speech recognition (2nd ed.). Upper Saddle River, NJ: Prentice Hall.
  8. Karlsson, F., Voutilainen, A., Heikkilä, J., & Anttila, A. (1995). Constraint grammar. A language-independent system for parsing unrestricted text (p. 430). Berlin and New York: Mouton de Gruyter.
  9. Marcus, M., Santorini, B., & Marcinkiewicz, M. (1993). Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19(2), 313–330.
  10. Ratnaparkhi, A. (1996). A maximum entropy part of speech tagger. In Proceedings of the ACL-SIGDAT conference on empirical methods in natural language processing (pp. 17–18). Philadelphia, PA.
  11. Schmid, H. (1994a). Part-of-speech tagging with neural networks. In Proceedings of COLING-94 (pp. 172–176). Kyoto, Japan.
  12. Schmid, H. (1994b). Probabilistic part-of-speech tagging using decision trees. In Proceedings of the international conference on new methods in language processing (NeMLaP) (pp. 44–49). Manchester, UK.
  13. Schutze, H. (1995). Distributional part-of-speech tagging. In Proceedings of EACL 7 (pp. 141–148). Dublin, Ireland.
  14. Shen, L., Satta, G., & Joshi, A. (2007). Guided learning for bidirectional sequence classification. In Proceedings of the 45th annual meetings of the association of computational linguistics (ACL 2007) (pp. 760–767). Prague, Czech Republic.
  15. Ushioda, A. (1996). Hierarchical clustering of words and applications to NLP tasks. In Proceedings of the fourth workshop on very large corpora (pp. 28–41). Somerset, NJ.
  16. van Halteren, H. (Ed.). (1999). Syntactic wordclass tagging. Boston: Kluwer Academic Publishers.
  17. van Halteren, H., Zavrel, J., & Daelemans, W. (2001). Improving accuracy in NLP through combination of machine learning systems. Computational Linguistics, 27(2), 199–229.

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Walter Daelemans (1)
  1. CLIPS, University of Antwerp, Antwerpen, Belgium