POS Tagging

Daelemans, Walter

doi:10.1007/978-0-387-30164-8_643


Synonyms

Grammatical tagging; Morphosyntactic disambiguation; Part of speech tagging; Tagging

Definition

Part-of-speech tagging (POS tagging) is a process in which each word in a text is assigned its appropriate morphosyntactic category (for example noun-singular, verb-past, adjective, pronoun-personal, and the like). It therefore provides information about both morphology (structure of words) and syntax (structure of sentences). This disambiguation process is determined both by constraints from the lexicon (what are the possible categories for a word?) and by constraints from the context in which the word occurs (which of the possible categories is the right one in this context?). For example, a word like "table" can be a noun-singular, but also a verb-present (as in "I table this motion"). This is lexical knowledge. It is the context of the word that should be used to decide which of the possible categories is the correct one. In a sentence like "Put it on the table", the fact that "table"...
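The interplay of lexical and contextual constraints described above can be illustrated with a minimal sketch. It is not the method of any tagger listed under Recommended Reading. The toy lexicon, the extra tag names (determiner, preposition), the function names, and the two context rules are all assumptions made for illustration only.

```python
# Toy lexicon (assumed for illustration): each word maps to the categories
# it can take. This is the "lexical knowledge" part of the task.
LEXICON = {
    "i": ["pronoun-personal"],
    "it": ["pronoun-personal"],
    "put": ["verb-present"],
    "on": ["preposition"],
    "the": ["determiner"],
    "this": ["determiner"],
    "motion": ["noun-singular"],
    "table": ["noun-singular", "verb-present"],  # the ambiguous word
}


def choose_tag(prev_tag, candidates):
    """Pick one category using the previous tag as context (hypothetical rules)."""
    if len(candidates) == 1:
        return candidates[0]  # the lexicon alone settles it
    if prev_tag == "determiner" and "noun-singular" in candidates:
        return "noun-singular"  # assumed rule: determiner + ambiguous word -> noun
    if prev_tag == "pronoun-personal" and "verb-present" in candidates:
        return "verb-present"  # assumed rule: pronoun + ambiguous word -> verb
    return candidates[0]  # fallback: first listed category


def tag(sentence):
    """Tag a whitespace-tokenized sentence from left to right."""
    tagged = []
    prev_tag = None
    for word in sentence.lower().split():
        candidates = LEXICON.get(word, ["unknown"])
        current = choose_tag(prev_tag, candidates)
        tagged.append((word, current))
        prev_tag = current
    return tagged


if __name__ == "__main__":
    print(tag("Put it on the table"))
    print(tag("I table this motion"))
```

In this sketch the lexicon proposes the candidate categories and the preceding tag selects among them, so the same word form receives different tags in the two example sentences. Real taggers replace the hand-written rules with statistical or learned models of context.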


Recommended Reading

Brants, T. (2000). TnT – A statistical part-of-speech tagger. In Proceedings of the sixth applied natural language processing conference ANLP-2000. Seattle, WA.
Brill, E. (1995a). Transformation-based error-driven learning and natural language processing: A case study in part-of-speech tagging. Computational Linguistics, 21(4), 543–565.
Brill, E. (1995b). Unsupervised learning of disambiguation rules for part of speech tagging. In Proceedings of the third workshop on very large corpora (pp. 1–13). Ohio State University, Ohio.
Cussens, J. (1997). Part-of-speech tagging using Progol. In N. Lavrac, & S. Dzeroski (Eds.), Proceedings of the seventh international workshop on inductive logic programming, Lecture Notes in Computer Science (Vol. 1297, pp. 93–108). London: Springer.
Daelemans, W., Zavrel, J., Berck, P., & Gillis, S. (1996). MBT: A memory-based part of speech tagger generator. In Proceedings of the fourth workshop on very large corpora (pp. 14–27). Copenhagen, Denmark.
Garside, R., & Smith, N. (1997). A hybrid grammatical tagger: CLAWS4. In R. Garside, G. Leech, & A. McEnery (Eds.), Corpus annotation: Linguistic information from computer text corpora (pp. 102–121). London: Longman.
Jurafsky, D., & Martin, J. (2008). Speech and language processing: An introduction to natural language processing, computational linguistics, and speech recognition (2nd ed.). Upper Saddle River, NJ: Prentice Hall.
Karlsson, F., Voutilainen, A., Heikkilä, J., & Anttila, A. (1995). Constraint grammar. A language-independent system for parsing unrestricted text (p. 430). Berlin and New York: Mouton de Gruyter.
Marcus, M., Santorini, B., & Marcinkiewicz, M. (1993). Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19(2), 313–330.
Ratnaparkhi, A. (1996). A maximum entropy part of speech tagger. In Proceedings of the ACL-SIGDAT conference on empirical methods in natural language processing (pp. 17–18). Philadelphia, PA.
Schmid, H. (1994a). Part-of-speech tagging with neural networks. In Proceedings of COLING-94 (pp. 172–176). Kyoto, Japan.
Schmid, H. (1994b). Probabilistic part-of-speech tagging using decision trees. In Proceedings of the international conference on new methods in language processing (NeMLaP) (pp. 44–49). Manchester, UK.
Schütze, H. (1995). Distributional part-of-speech tagging. In Proceedings of EACL 7 (pp. 141–148). Dublin, Ireland.
Shen, L., Satta, G., & Joshi, A. (2007). Guided learning for bidirectional sequence classification. In Proceedings of the 45th annual meeting of the association for computational linguistics (ACL 2007) (pp. 760–767). Prague, Czech Republic.
Ushioda, A. (1996). Hierarchical clustering of words and applications to NLP tasks. In Proceedings of the fourth workshop on very large corpora (pp. 28–41). Somerset, NJ.
van Halteren, H. (Ed.). (1999). Syntactic wordclass tagging. Boston: Kluwer Academic Publishers.
van Halteren, H., Zavrel, J., & Daelemans, W. (2001). Improving accuracy in NLP through combination of machine learning systems. Computational Linguistics, 27(2), 199–229.


Author information

Authors and Affiliations

CLIPS University of Antwerp, Antwerpen, Belgium
Walter Daelemans


Editor information

Editors and Affiliations

School of Computer Science and Engineering, University of New South Wales, Sydney, Australia, 2052
Claude Sammut
Faculty of Information Technology, Clayton School of Information Technology, Monash University, P.O. Box 63, Victoria, Australia, 3800
Geoffrey I. Webb


About this entry

Cite this entry

Daelemans, W. (2011). POS Tagging. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-30164-8_643


DOI: https://doi.org/10.1007/978-0-387-30164-8_643
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-30768-8
Online ISBN: 978-0-387-30164-8
eBook Packages: Computer Science; Reference Module Computer Science and Engineering
