Artificial Intelligence Review, Volume 28, Issue 4, pp 275–303

How evolutionary algorithms are applied to statistical natural language processing

  • Lourdes Araujo


Statistical natural language processing (NLP) and evolutionary algorithms (EAs) are two very active areas of research that have been combined many times. In general, applying statistical models to NLP tasks requires designing specific algorithms to train the models and to apply them to new texts, and developing such algorithms can be hard. This makes EAs attractive: they offer a general design while still providing high performance under particular conditions of application. In this article, we present a survey of many works that apply EAs to different NLP problems, including syntactic and semantic analysis, grammar induction, summarization and text generation, document clustering, and machine translation. The review concludes by identifying which problems, or which particular aspects of those problems, are best suited to being solved with an evolutionary algorithm.


Keywords: Evolutionary algorithms, Statistical natural language processing





Copyright information

© Springer Science+Business Media B.V. 2009

Authors and Affiliations

  1. Dpto. de Lenguajes y Sistemas Informáticos, Universidad Nacional de Educación a Distancia (UNED), Madrid, Spain
