Learning stochastic regular grammars by means of a state merging method

  • Rafael C. Carrasco
  • Jose Oncina
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 862)


We propose a new algorithm that identifies any stochastic deterministic regular language and also determines the probabilities of the strings in the language. The algorithm builds the prefix tree acceptor from the sample set and systematically merges equivalent states. Experimentally, it proves very fast, and the time needed grows only linearly with the size of the sample set.
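The two steps named in the abstract — building a prefix tree acceptor with frequency counts, then merging states whose observed behaviour is statistically equivalent — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class and function names (`PTANode`, `build_pta`, `compatible`, `different`) and the confidence parameter `alpha` are assumptions for this sketch, and the equivalence test uses a Hoeffding-style bound on the difference of observed frequencies, which is one common way to realise such a test.

```python
import math

class PTANode:
    """A state of the prefix tree acceptor, annotated with frequency counts."""
    def __init__(self):
        self.arrivals = 0    # number of sample strings passing through this state
        self.endings = 0     # number of sample strings ending exactly here
        self.children = {}   # symbol -> PTANode (outgoing transitions)

def build_pta(sample):
    """Build the prefix tree acceptor from a list of sample strings."""
    root = PTANode()
    for s in sample:
        node = root
        node.arrivals += 1
        for symbol in s:
            node = node.children.setdefault(symbol, PTANode())
            node.arrivals += 1
        node.endings += 1
    return root

def different(f1, n1, f2, n2, alpha):
    """Hoeffding-style test: do the observed frequencies f1/n1 and f2/n2
    differ by more than the confidence range allows at level alpha?"""
    bound = math.sqrt(0.5 * math.log(2.0 / alpha)) * (1 / math.sqrt(n1) + 1 / math.sqrt(n2))
    return abs(f1 / n1 - f2 / n2) > bound

def compatible(q1, q2, alpha):
    """Two states are candidates for merging if their ending frequencies and
    all outgoing transition frequencies agree, recursively on their subtrees."""
    if different(q1.endings, q1.arrivals, q2.endings, q2.arrivals, alpha):
        return False
    for symbol in set(q1.children) | set(q2.children):
        c1, c2 = q1.children.get(symbol), q2.children.get(symbol)
        f1 = c1.arrivals if c1 else 0
        f2 = c2.arrivals if c2 else 0
        if different(f1, q1.arrivals, f2, q2.arrivals, alpha):
            return False
        if c1 and c2 and not compatible(c1, c2, alpha):
            return False
    return True
```

Because the counts in the tree grow with the sample while each string is processed in one pass, the construction cost is linear in the total sample length, which is consistent with the linear running time reported above.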







Copyright information

© Springer-Verlag Berlin Heidelberg 1994

Authors and Affiliations

  • Rafael C. Carrasco
  • Jose Oncina
  1. Departamento de Tecnología Informática y Computación, Universidad de Alicante, Alicante
