Learning Rational Stochastic Languages

  • François Denis
  • Yann Esposito
  • Amaury Habrard
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4005)

Abstract

Given a finite set of words \(w_1, \ldots, w_n\) independently drawn according to a fixed unknown distribution law P called a stochastic language, a usual goal in Grammatical Inference is to infer an estimate of P in some class of probabilistic models, such as Probabilistic Automata (PA). Here, we study the class \({{\mathcal S}_{\mathbb R}^{rat}(\Sigma)}\) of rational stochastic languages, which consists of the stochastic languages that can be generated by Multiplicity Automata (MA) and which strictly includes the class of stochastic languages generated by PA. Rational stochastic languages have a minimal normal representation, which may be very concise and whose parameters can be efficiently estimated from stochastic samples. We design an efficient inference algorithm, DEES, which aims at building a minimal normal representation of the target. Although no recursively enumerable class of MA computes exactly \({{\mathcal S}_{\mathbb Q}^{rat}(\Sigma)}\), we show that DEES strongly identifies \({{\mathcal S}_{\mathbb Q}^{rat}(\Sigma)}\) in the limit. We study the intermediary MA output by DEES and show that they compute rational series which converge absolutely and which can be used to provide stochastic languages that closely estimate the target.
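To make the notion of a stochastic language generated by an MA concrete, the following is a minimal sketch, not taken from the paper. It assumes the usual linear representation of an MA: an initial vector, one transition matrix per letter, and a terminal vector, with the value of a word obtained as the product of these along the word. The two-state automaton, the one-letter alphabet, and the names `series_value`, `iota`, `mats` and `tau` are all illustrative assumptions. The example uses a negative initial weight, which a PA does not allow, yet every word still receives a non-negative value and the values sum to 1.

```python
from fractions import Fraction as F


def series_value(iota, mats, tau, word):
    """Return r(word) = iota * M_{w1} * ... * M_{wk} * tau (row vector times matrices)."""
    vec = list(iota)
    for letter in word:
        m = mats[letter]
        # Row-vector times matrix product, done by hand to stay dependency-free.
        vec = [sum(vec[i] * m[i][j] for i in range(len(vec)))
               for j in range(len(m[0]))]
    return sum(v * t for v, t in zip(vec, tau))


# Hypothetical two-state MA over the alphabet {a} (an assumption for illustration).
# It computes r(a^n) = (1/2)^n - (2/3) * (1/3)^n, i.e. 2*p1 - p2 for two
# geometric distributions p1 and p2.
iota = [F(2), F(-1)]                            # initial weights, one is negative
mats = {"a": [[F(1, 2), F(0)],
              [F(0), F(1, 3)]]}                 # transition matrix for the letter a
tau = [F(1, 2), F(2, 3)]                        # terminal weights

# Values of the first 30 words: the empty word, a, aa, ...
values = [series_value(iota, mats, tau, "a" * n) for n in range(30)]
partial_mass = sum(values)

print(values[0], values[1])      # r(empty word) and r(a)
print(float(partial_mass))       # close to 1; the remaining mass is on longer words
```

Exact rational arithmetic (`fractions.Fraction`) is used here so that non-negativity and the total mass can be checked without rounding error; a floating-point version would behave the same way up to numerical precision.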

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • François Denis (1)
  • Yann Esposito (1)
  • Amaury Habrard (1)
  1. Laboratoire d’Informatique Fondamentale de Marseille (L.I.F.), UMR CNRS 6166
