Identification in the Limit of Substitutable Context-Free Languages

  • Alexander Clark
  • Rémi Eyraud
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3734)


This paper formalisms the idea of substitutability introduced by Zellig Harris in the 1950s and makes it the basis for a learning algorithm from positive data only for a subclass of context-free grammars. We show that there is a polynomial characteristic set, and thus prove polynomial identification in the limit of this class. We discuss the relationship of this class of languages to other common classes discussed in grammatical inference. We also discuss modifications to the algorithm that produces a reduction system rather than a context-free grammar, that will be much more compact. We discuss the relationship to Angluin’s notion of reversibility for regular languages.


Transitive Closure Regular Language Reduction System Congruence Class Nonterminal Symbol 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.


  1. 1.
    Adriaans, P., Trautwein, M., Vervoort, M.: Towards high speed grammar induction on large text corpora. In: Jeffery, K., Hlaváč, V., Wiedermann, J. (eds.) SOFSEM 2000. LNCS, vol. 1963, pp. 173–186. Springer, Heidelberg (2000)CrossRefGoogle Scholar
  2. 2.
    Angluin, D.: Inference of reversible languages. Communications of the ACM 29, 741–765 (1982)zbMATHMathSciNetGoogle Scholar
  3. 3.
    Book, R., Otto, F.: String rewriting systems. Springer, Heidelberg (1993)zbMATHGoogle Scholar
  4. 4.
    Chomsky, N.: Systems of syntactic analysis. Journal of Symbolic Logic 18(3) (1953)Google Scholar
  5. 5.
    de la Higuera, C.: Characteristic sets for polynomial grammatical inference. Machine Learning 27(2), 125–138 (1997)zbMATHCrossRefGoogle Scholar
  6. 6.
    Gold, E.M.: Language indentification in the limit. Information and Control 10(5), 447–474 (1967)zbMATHCrossRefGoogle Scholar
  7. 7.
    Harris, Z.: Distributional structure. Word 10(2-3), 146–162 (1954)Google Scholar
  8. 8.
    Laxminarayana, J.A., Nagaraja, G.: Inference of a subclass of context free grammars using positive samples. In: Proceedings of the Workshop on Learning Context-Free Grammars at ECML/PKDD 2003 (2003)Google Scholar
  9. 9.
    Mäkinen, E.: On inferring zero-reversible languages. Technical Report A-1998-7, University of Tampere (1998)Google Scholar
  10. 10.
    Ron, D., Singer, Y., Tishby, N.: On the learnability and usage of acyclic probabilistic finite automata. Journal of Computer and System Sciences (JCSS) 56(2), 133–152 (1998)zbMATHCrossRefMathSciNetGoogle Scholar
  11. 11.
    Senizergues, G.: The equivalence and inclusion problems for NTS languages. J. Comput. Syst. Sci. 31(3), 303–331 (1985)zbMATHCrossRefMathSciNetGoogle Scholar
  12. 12.
    Starkie, B.: Identifying Languages in the Limit using Alignment Based Learning. PhD thesis, University of Newcastle, Australia (2004)Google Scholar
  13. 13.
    Yokomori, T.: On polynomial-time learnability in the limit of strictly deterministic automata. Machine Learning 19(2), 153–179 (1995)zbMATHGoogle Scholar
  14. 14.
    Yokomori, T.: Polynomial-time identification of very simple grammars from positive data. Theoretical Computer Science 298(1), 179–206 (2003)zbMATHCrossRefMathSciNetGoogle Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Alexander Clark
    • 1
  • Rémi Eyraud
    • 2
  1. 1.Department of Computer ScienceRoyal Holloway University of LondonEghamUK
  2. 2.EURISESaint-ÉtienneFrance

Personalised recommendations