Planar Languages and Learnability

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4201)


Strings can be mapped into Hilbert spaces using feature maps such as the Parikh map. Languages can then be defined as the pre-images of hyperplanes in the feature space, rather than by grammars or automata. These are the planar languages. In this paper we show that, using techniques from kernel-based learning, we can represent and efficiently learn, from positive data alone, various linguistically interesting context-sensitive languages. In particular, we show that the cross-serial dependencies in Swiss German, which established the non-context-freeness of natural language, are learnable using a standard kernel. We demonstrate the polynomial-time identifiability in the limit of these classes, and discuss some of their language-theoretic properties and how these relate to the choice of kernel or feature map.
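As an illustration of the idea in the abstract, the following is a minimal sketch and not the paper's algorithm. It assumes explicit Parikh feature vectors rather than kernels, and it assumes the learner simply takes the smallest affine subspace containing the images of the positive samples. The function names (`parikh`, `learn_plane`, `in_plane`) are hypothetical and chosen only for this example.

```python
from fractions import Fraction


def parikh(s, alphabet):
    """Parikh map: count the occurrences of each alphabet symbol in s."""
    return [Fraction(s.count(a)) for a in alphabet]


def reduce_against(v, basis):
    """Eliminate from v its components along the stored basis rows."""
    v = list(v)
    for pivot, row in basis:
        if v[pivot] != 0:
            factor = v[pivot] / row[pivot]
            v = [x - factor * y for x, y in zip(v, row)]
    return v


def learn_plane(samples, alphabet):
    """Assumed learner: affine hull of the feature vectors of positive samples.

    Returns an origin point and a list of (pivot index, direction) pairs.
    """
    origin = parikh(samples[0], alphabet)
    basis = []
    for s in samples[1:]:
        diff = [x - o for x, o in zip(parikh(s, alphabet), origin)]
        residual = reduce_against(diff, basis)
        pivot = next((i for i, x in enumerate(residual) if x != 0), None)
        if pivot is not None:  # the sample adds a new direction to the plane
            basis.append((pivot, residual))
    return origin, basis


def in_plane(s, plane, alphabet):
    """Membership test: does the feature vector of s lie in the learned plane?"""
    origin, basis = plane
    diff = [x - o for x, o in zip(parikh(s, alphabet), origin)]
    return all(x == 0 for x in reduce_against(diff, basis))


alphabet = "abc"
plane = learn_plane(["abc", "aabbcc"], alphabet)
print(in_plane("aaabbbccc", plane, alphabet))  # equal counts of a, b, c
print(in_plane("aabbc", plane, alphabet))      # unequal counts
print(in_plane("cba", plane, alphabet))        # the Parikh map ignores order
```

Because the Parikh map discards symbol order, the language learned here is the set of strings with equal numbers of a, b and c, not only strings of the form a^n b^n c^n. This is one way to see why the choice of feature map determines which languages are planar.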





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  1. Department of Computer Science, University of London, Egham, UK
  2. Faculté des Sciences et Techniques, Département Informatique, Saint-Etienne, France
