Inferring stochastic regular grammars with recurrent neural networks

  • Rafael C. Carrasco
  • Mikel L. Forcada
  • Laureano Santamaría
Session: Inference of Stochastic Models 2
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1147)


Recent work has shown that the extraction of symbolic rules improves the generalization performance of recurrent neural networks trained with complete (positive and negative) samples of regular languages. This paper explores the possibility of inferring the rules of the language when the network is instead trained with stochastic, positive-only data. For this purpose, a recurrent network with two layers is used. If, rather than using the network itself, an automaton is extracted from the network after training and the transition probabilities of the extracted automaton are estimated from the sample, the relative entropy with respect to the true distribution is reduced.
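The two steps described in the last sentence — estimating the transition probabilities of an extracted automaton from a positive sample, and measuring the relative entropy against the true distribution — can be sketched as follows. This is a minimal illustration, not the paper's method: the toy automaton, the end-of-string marker, and all function names are assumptions for the example.

```python
from collections import defaultdict
from math import log

# Hypothetical deterministic automaton over {a, b}: state -> symbol -> next state.
# A real extracted automaton would come from clustering the network's hidden states.
TRANSITIONS = {0: {"a": 1, "b": 0}, 1: {"a": 1, "b": 0}}
END = "#"  # end-of-string event, counted alongside the symbols at each state

def estimate_probabilities(sample):
    """Maximum-likelihood estimate: count each event (symbol or end-of-string)
    observed at each state while parsing the sample, then normalize."""
    counts = defaultdict(lambda: defaultdict(int))
    for string in sample:
        state = 0
        for symbol in string:
            counts[state][symbol] += 1
            state = TRANSITIONS[state][symbol]
        counts[state][END] += 1
    return {q: {e: c / sum(events.values()) for e, c in events.items()}
            for q, events in counts.items()}

def string_probability(probs, string):
    """Probability the stochastic automaton assigns to a complete string."""
    p, state = 1.0, 0
    for symbol in string:
        p *= probs[state].get(symbol, 0.0)
        state = TRANSITIONS[state][symbol]
    return p * probs[state].get(END, 0.0)

def relative_entropy(true_dist, probs):
    """D(p || q) = sum_x p(x) log2(p(x) / q(x)), summed over the support of p."""
    return sum(p * log(p / string_probability(probs, x), 2)
               for x, p in true_dist.items() if p > 0)
```

For instance, a sample in which half the strings are empty and half are "a" yields estimated probabilities that reproduce the true distribution exactly, so the relative entropy is zero; estimates from a finite sample of a more complex language would give a small positive value.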







Copyright information

© Springer-Verlag Berlin Heidelberg 1996

Authors and Affiliations

  • Rafael C. Carrasco (1)
  • Mikel L. Forcada (1)
  • Laureano Santamaría (1)

  1. Departamento de Lenguajes y Sistemas Informáticos, Universidad de Alicante, Alicante, Spain
