Abstract
In this paper, we propose techniques for injecting finite state automata into Recurrent Radial Basis Function networks (R2BF). We show that, when proper hints are provided and the weight space is suitably constrained, these networks behave as automata. We also suggest a technique for forcing the learning process to develop automata representations, based on adding a proper penalty function to the ordinary cost. Successful experimental results are reported for inductive inference of regular grammars.
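The penalty-function idea described above can be sketched in code. The following is a minimal illustration, not the paper's exact formulation: it assumes a quadratic penalty that vanishes only when each recurrent state activation saturates at 0 or 1, so that minimizing the augmented cost pushes the network's continuous dynamics toward the discrete states of an automaton. The function names and the weighting parameter `lam` are hypothetical.

```python
import numpy as np

def automaton_penalty(states):
    """Penalty that is zero only when every state activation is 0 or 1.

    Intermediate (non-saturated) activations are penalized, encouraging
    the recurrent state units to behave like discrete automaton states.
    """
    states = np.asarray(states, dtype=float)
    return float(np.sum((states * (1.0 - states)) ** 2))

def total_cost(task_cost, states, lam=0.1):
    """Ordinary cost plus the automaton-forcing penalty, weighted by lam."""
    return task_cost + lam * automaton_penalty(states)

# Saturated states incur no penalty; fuzzy states do.
assert automaton_penalty([0.0, 1.0, 1.0, 0.0]) == 0.0
assert automaton_penalty([0.5, 0.3, 0.9, 0.1]) > 0.0
```

During training, the gradient of `total_cost` would be backpropagated in place of the ordinary cost's gradient, trading a small amount of task error for more automaton-like internal representations.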
Frasconi, P., Gori, M., Maggini, M. et al. Representation of finite state automata in Recurrent Radial Basis Function networks. Mach Learn 23, 5–32 (1996). https://doi.org/10.1007/BF00116897