Unsupervised Learning of Disambiguation Rules for Part-of-Speech Tagging

Natural Language Processing Using Very Large Corpora

Part of the book series: Text, Speech and Language Technology (TLTB, volume 11)

Abstract

In this paper we describe an unsupervised learning algorithm for automatically training a rule-based part-of-speech tagger without using a manually tagged corpus. We compare this algorithm to the Baum-Welch algorithm, used for unsupervised training of stochastic taggers. Next, we show a method for combining unsupervised and supervised rule-based training algorithms to create a highly accurate tagger using only a small amount of manually tagged text.

Copyright information

© 1999 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Brill, E., Pop, M. (1999). Unsupervised Learning of Disambiguation Rules for Part-of-Speech Tagging. In: Armstrong, S., Church, K., Isabelle, P., Manzi, S., Tzoukermann, E., Yarowsky, D. (eds) Natural Language Processing Using Very Large Corpora. Text, Speech and Language Technology, vol 11. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-2390-9_3

  • DOI: https://doi.org/10.1007/978-94-017-2390-9_3

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-90-481-5349-7

  • Online ISBN: 978-94-017-2390-9

  • eBook Packages: Springer Book Archive
