Part-of-Speech Tagging with Evolutionary Algorithms

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2276)

Abstract

This paper presents a part-of-speech tagger based on a genetic algorithm. The algorithm evolves a population of candidate tag sequences for the words in the text and selects the best individual as the solution. The paper describes the main design issues of the algorithm: the chromosome representation, the evaluation of individuals, and the genetic operators for crossover and mutation. The fitness function is defined by a probabilistic model based on the context of each word, that is, the tags of the surrounding words. The model has been implemented and several factors have been investigated: the size of the training corpus, the effect of the context size, and the parameters of the evolutionary algorithm, such as population size and crossover and mutation rates. The accuracy obtained with this method is comparable to that of other probabilistic approaches, while the evolutionary algorithm obtains its results more efficiently.
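The scheme outlined in the abstract can be illustrated with a minimal sketch. It is not the implementation from the paper: the toy corpus, the function names (`train`, `fitness`, `tag_sentence`), the use of raw context counts as fitness, truncation selection, one-point crossover, and the default parameter values are all assumptions made for illustration. Each individual is a list with one tag per word, and its fitness sums how often each (left tag, tag, right tag) context was seen in training.

```python
import random
from collections import defaultdict

# Toy tagged corpus (hypothetical data, for illustration only).
TRAIN = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("bark", "NOUN"), ("is", "VERB"), ("loud", "ADJ")],
    [("dogs", "NOUN"), ("bark", "VERB")],
    [("the", "DET"), ("loud", "ADJ"), ("dog", "NOUN"), ("barks", "VERB")],
]

def train(corpus):
    """Collect the tags seen for each word and counts of (left, tag, right) contexts."""
    lexicon = defaultdict(set)
    context = defaultdict(int)
    for sent in corpus:
        tags = ["<s>"] + [t for _, t in sent] + ["</s>"]
        for i, (word, tag) in enumerate(sent):
            lexicon[word].add(tag)
            context[(tags[i], tag, tags[i + 2])] += 1
    return lexicon, context

def fitness(tags, context):
    """Assumed fitness: sum of training counts of each tag's one-tag-wide context."""
    padded = ["<s>"] + list(tags) + ["</s>"]
    return sum(context.get((padded[i], padded[i + 1], padded[i + 2]), 0)
               for i in range(len(tags)))

def tag_sentence(words, lexicon, context, pop_size=20, generations=30,
                 p_cross=0.6, p_mut=0.1, seed=0):
    """Evolve tag sequences for `words`; assumes every word occurs in the lexicon."""
    rng = random.Random(seed)
    options = [sorted(lexicon[w]) for w in words]  # candidate tags per word
    population = [[rng.choice(opts) for opts in options] for _ in range(pop_size)]

    def score(individual):
        return fitness(individual, context)

    for _ in range(generations):
        # Truncation selection: the better half survives unchanged.
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = list(a)
            # One-point crossover between two parents.
            if len(words) > 1 and rng.random() < p_cross:
                cut = rng.randrange(1, len(words))
                child = a[:cut] + b[cut:]
            # Mutation: re-draw a tag from the word's own candidate tags.
            for i, opts in enumerate(options):
                if rng.random() < p_mut:
                    child[i] = rng.choice(opts)
            children.append(child)
        population = parents + children
    return max(population, key=score)

lexicon, context = train(TRAIN)
print(tag_sentence(["the", "bark", "is", "loud"], lexicon, context))
```

In this sketch, mutation only draws from the tags a word was seen with in training, so every individual remains a valid tagging. Widening the context or replacing raw counts with relative frequencies would change only `train` and `fitness`.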

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Araujo, L. (2002). Part-of-Speech Tagging with Evolutionary Algorithms. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2002. Lecture Notes in Computer Science, vol 2276. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45715-1_21

  • DOI: https://doi.org/10.1007/3-540-45715-1_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43219-7

  • Online ISBN: 978-3-540-45715-2

  • eBook Packages: Springer Book Archive
