Skip to main content


Log in

A survey of grammatical inference methods for natural language learning

  • Published:
Artificial Intelligence Review Aims and scope Submit manuscript


The high complexity of natural language and the huge amount of human and temporal resources necessary for producing the grammars lead several researchers in the area of Natural Language Processing to investigate various solutions for automating grammar generation and updating processes. Many algorithms for Context-Free Grammar inference have been developed in the literature. This paper provides a survey of the methodologies for inferring context-free grammars from examples, developed by researchers in the last decade. After introducing some preliminary definitions and notations concerning learning and inductive inference, some of the most relevant existing grammatical inference methods for Natural Language are described and classified according to the kind of presentation (if text or informant) and the type of information (if supervised, unsupervised, or semi-supervised). Moreover, the state of the art of the strategies for evaluation and comparison of different grammar inference methods is presented. The goal of the paper is to provide a reader with introduction to major concepts and current approaches in Natural Language Learning research.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
EUR 32.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or Ebook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others


  • Adriaans PW (2001) Learning shallow context-free languages under simple distributions. In: Opestake A, Vermeulen K (eds) Algebras, diagrams and decisions in language, logic and computation, CSLI/CUP

  • Adriaans PW (1992) Language learning from a categorial perspective. PhD thesis, University of Amsterdam, Amsterdam

  • Adriaans PW, Vervoort M (2002) The EMILE 4.1 grammar induction toolbox. In: Adriaans P, Fernau H, van Zaanen M (eds) Grammatical inference: algorithms and applications: 6th international colloquium: ICGI 2002. Lecture notes in computer science, vol 2484. Springer, Heidelberg, pp 293–295

  • Angluin D (1982) Inference of reversible languages. J ACM 29: 741–765

    Article  MathSciNet  MATH  Google Scholar 

  • Baker JK (1979) Trainable grammars for speech recognition. In: Klatt DH, Wolf JJ (eds) Speech communication papers for the 97th meeting of the acoustical society of America, pp 547–550

  • Black E, Abney S, Flickinger D, Gdaniec C, Grishman R, Harrison P, Hindle D, Ingria R, Jelinek F, Klavans J, Liberman M, Marcus M, Roukos S, Santorini B, Strzalkowski T (1991) A procedure for quantitatively comparing the syntactic coverage of English grammars. In: Proceedings of the DARPA speech and natural language workshop, pp 306–311

  • Blum A, Mitchell T (1998) Combining labeled and unlabeled data with cotraining. In: Proceedings of the workshop on computational learning theory

  • Bonnema R, Bod R, Scha R (1997) A DOP model for semantic interpretation. In: ACL 1997, pp 159–167

  • Briscoe T (2000) Grammatical acquisition: inductive bias and coevolution of language and the language acquisition device. Language, pp 245–296

  • Charniak E, Johnson M (2005) Coarse-to-fine n-best parsing and MaxEnt discriminative reranking. In: Proceedings of the 43rd annual meeting of the ACL, Ann Arbor, pp 173–180

  • Charniak E (1997) Statistical parsing with a context-free grammar and word statistics. In: Proceedings of the fourteenth national conference on artificial intelligence, Menlo Park. AAAI Press/MIT Press

  • Chomsky N (1957) Syntactic Structures. The Hague Mouton.

  • Clark A (2001) Unsupervised induction of stochastic context-free grammars using distributional clustering. In: ConLL ‘01: Proceedings of the 2001 workshop on computational natural language learning, Morristown, NJ, USA. Association for Computational Linguistics, pp 1–8

  • Cramer B (2007) Limitations of current grammar induction algorithms. In: Proceedings of the 45th annual meeting of the ACL: student research workshop, June 25–26, 2007, Prague, Czech Republic

  • Déjean H (2000) ALLiS: a symbolic learning system for natural language learning. In: Cardie C, Daelemans W, N’edellec C, Tjong Kim Sang E (eds) Proceedings of the fourth conference on computational natural language learning and of the second learning language in logic workshop; Lisbon, Portugal. Held in cooperation with ICGI-2000, pp 95–98

  • de la Higuera C, Oncina J (2003) Identification with Probability One of Stochastic Deterministic Linear Languages. In: Proceedings of ALT 2003. Springer, Berlin, Heidelberg, pp 134–148

  • Denis F (1998) Pac learning from positive statistical queries. In: Proceedings of 9th international conference on algorithmic learning theory—ALT ‘98, Springer, pp 112–126

  • Edelman S, Solan Z, Horn D, Ruppin E (2005) Learning syntactic constructions from raw corpora. In: 29th Boston University conference on language development, Cascadilla Press

  • Emerald JD, Subramanian KG, Thomas DG (1996) Learning code regular and code linear languages. In: Proceedings of international colloquium on grammatical inference (ICGI-96), lecture notes in artificial intelligence 1147, Springer, pp 211–221

  • Garcia P, Vidal E (1990) Inference of K-testable languages in the strict sense and applications to syntactic pattern recognition. J IEEE Trans Pattern Anal Mach Intell 12(9): 920–925

    Article  Google Scholar 

  • Gold EM (1967) Language identification in the limit. Inform Control 10: 447–474

    Article  MATH  Google Scholar 

  • Hänig C, Bordag S, Quasthoff U (2008) UnsuParse: unsupervised parsing with unsupervised part of speech tagging. In: Proceedings of the sixth international language resources and evaluation (LREC 2008)

  • Hopcroft JE, Ullman JE (1979) Introduction to automata theory, languages, and computation. Addison-Wesley, New York

    MATH  Google Scholar 

  • Horning JJ (1969) A study of grammatical inference. PhD thesis, Stanford University, Stanford:CA, USA

  • Kasami T (1965) An efficient recognition and syntax analysis algorithm for context-free languages. Science report, Air Force Cambridge Research Laboratory, Bedford

  • Koshiba T, Makinen E, Takada Y (1997) Inferring pure context-free languages from positive data. Technical report A-1997-14, Department of Computer Science, University of Tampere

  • Langley P, Stromsten S (2000) Learning context-free grammars with a simplicity bias. In: Proceedings of the eleventh European conference on machine learning (ECML 2000), lecture notes in artificial intelligence 1810, Springer, pp 220–228

  • Levenshtein VI (1965) Binary codes capable of correcting deletions, insertions, and reversals. Doklady Akademii Nauk SSR 163(4): 845–848 (Original in Russian)

    MathSciNet  Google Scholar 

  • MacWhinney B (1991) The CHILDES project: tools for analyzing talk. Erlbaum, Mahwah

    Google Scholar 

  • Marcus M, Santorini B, Marcinkiewicz M (1993) Building a large annotated corpus of English: the Penn treebank. Comput Linguist 19(2): 313–330

    Google Scholar 

  • McClosky D, Charniak E, Johnson M (2006) Effective self-training for parsing. In: Proceedings of HLT-NAACL 2006

  • Nakamura K (2003) Incremental learning of context free grammars by extended inductive cyk algorithm. In: Higuera C, Adriaans PW, Zaanen M, Oncina J (eds) ECML workshop on learning contex-free grammars. Ruder Boskovic Institute, Zagreb, pp 53–64

    Google Scholar 

  • Nakamura K, Matsumoto M (2002) Incremental learning of context free grammars. In: ICGI ‘02: proceedings of the 6th international colloquium on grammatical inference (London, UK), Springer, pp 174–184

  • Nakamura K, Ishiwata T (2000) Synthesizing context free grammars from sample strings based on inductive cyk algorithm. In: ICGI ‘00: proceedings of the 5th international colloquium on grammatical inference, London, UK, Springer, pp 186–195

  • Petasis G, Paliouras G, Karkaletsis V, Halatsis C, Spyropoulos CD (2004) e-GRIDS: computationally efficient grammatical inference from positive examples. GRAMMARS 7: 69–110

    Google Scholar 

  • Pullum GK (2003) Learnability. In: The Oxford International Encyclopaedia of Linguistics, 2nd edn. Oxford, Oxford University Press, pp 431–434

  • Rissanen J (1982) A universal prior for integers and estimation by minimum description length. Ann Statist 11: 416–431

    Article  MathSciNet  Google Scholar 

  • Roberts A, Atwell E (2002) Unsupervised grammar inference systems for natural language. Research report number 2002.20. School of Computing, University of Leeds

  • Sakakibara Y (1997) Recent advances of grammatical inference. Theor Comput Sci 185: 15–45

    Article  MathSciNet  MATH  Google Scholar 

  • Sakakibara Y, Brown M, Hughley R, Mian I, Sjolander K, Underwood R, Haussler D (1994) Stochastic context-free grammars for tRNA modeling. Nucleic Acids Res 22: 5112–5120

    Article  Google Scholar 

  • Sakakibara Y, Muramatsu H (2000) Learning context-free grammars from partially structured examples. In: Proceedings of the 5th international colloquium on grammatical inference: algorithms and applications (ICGI), pp 229–240

  • Salvador I, Benedı JM (2002) RNA modeling by combining stochastic context-free grammars and n-Gram models. Int J Pattern Recogn Artif Intell 16(3): 309–316

    Article  Google Scholar 

  • Seginer Y (2007) Fast unsupervised incremental parsing. In: Proceedings of the ACL 2007, Prague

  • Solan Z, Horn D, Ruppin E, Edelman S (2005) Unsupervised learning of natural languages. Proc Natl Acad Sci USA 102(33): 11629–11634

    Article  Google Scholar 

  • Steedman M, Osborne M, Sarkar A, Clark S, Hwa R, Hockenmaier J, Ruhlen P, Baker S, Crim J (2003) Bootstrapping statistical parsers from small datasets. In: Proceedings of the annual meeting of the European chapter of the ACL, Budapest, Hungary

  • van Zaanen MV (2001) Bootstrapping structure into language: alignment-based learning. PhD thesis, School of Computing, University of Leeds, UK

  • van Zaanen M, Adriaans P (2001) Alignment-based learning versus EMILE: a comparison. In: Proceedings of the Belgian-Dutch conference on artificial intelligence (BNAIC), Amsterdam, The Netherlands

  • Watkinson S, Manandhar S (2001) A psychologically plausible and computationally effective approach to learning syntax. In: Proceedings of the workshop computational natural language learning (CoNLL-2001), pp 160–167

  • Yokomori T (1995) On polynomial-time learnability in the limit of strictly deterministic automata. J Mach Learn 19: 153–179

    MATH  Google Scholar 

Download references

Author information

Authors and Affiliations


Corresponding author

Correspondence to Patrizia Grifoni.

Rights and permissions

Reprints and permissions

About this article

Cite this article

D’Ulizia, A., Ferri, F. & Grifoni, P. A survey of grammatical inference methods for natural language learning. Artif Intell Rev 36, 1–27 (2011).

Download citation

  • Published:

  • Issue Date:

  • DOI: