Abstract
We introduce a new connectionist paradigm that views neural networks as implementations of syntactic pattern recognition algorithms. Learning is thus seen as a process of grammatical inference, and recognition as a process of parsing. The possible realizations of this theme are diverse; in this paper we present some initial explorations of the case where the pattern grammar is context-free, inferred from examples by a separate procedure, and then mapped onto a connectionist parser. Unlike most neural networks, whose structure is pre-defined, the resulting network has as many levels as are necessary and arbitrary connections between levels. Furthermore, the addition of a delay element makes the network capable of dealing with time-varying patterns simply and efficiently. Since grammatical inference algorithms are notoriously expensive computationally, we place an important restriction on the type of context-free grammars that can be inferred, which dramatically reduces complexity. The resulting grammars are called ‘strictly-hierarchical’ and map straightforwardly onto a temporal connectionist parser (TCP) using a relatively small number of neurons. The new paradigm is applicable to a variety of pattern-processing tasks, such as speech recognition and character recognition. We concentrate here on hand-written character recognition; performance in other problem domains will be reported in future publications. Results are presented to illustrate the performance of the system with respect to a number of parameters: the inherent variability of the data, the nature of the learning (supervised or unsupervised), and the details of the clustering procedure used to limit the number of non-terminals inferred. In each of these cases (eight in total), we contrast the performance of a stochastic and a non-stochastic TCP. The stochastic TCP has greater powers of discrimination, but in many cases the results were very similar. If this result holds in practical situations it is important, because the non-stochastic version has a straightforward implementation in silicon.
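The abstract does not reproduce the chapter's formal definition of a strictly-hierarchical grammar, but one plausible reading of the term is that every non-terminal rewrites only to symbols from strictly lower levels, so that recognition can proceed bottom-up, one level at a time, with no recursion. The toy sketch below illustrates that idea; the grammar rules, symbol names, and greedy level-wise matching are all hypothetical illustrations, not the chapter's actual TCP construction.

```python
# Toy sketch of bottom-up recognition with a strictly-hierarchical
# grammar (assumed reading: each non-terminal is defined only over
# symbols of strictly lower levels, so no recursion is possible).

# Level 0: terminals (e.g. quantised stroke primitives for characters).
# Higher levels: non-terminals, each rewriting to lower-level symbols.
# All rules here are invented purely for illustration.
GRAMMAR = {
    1: {"UP": ("u", "u"), "DOWN": ("d", "d")},
    2: {"STROKE": ("UP", "DOWN")},
    3: {"CHAR_A": ("STROKE", "STROKE")},
}

def parse_level(sequence, rules):
    """Greedily replace runs of lower-level symbols by one non-terminal."""
    out, i = [], 0
    while i < len(sequence):
        for lhs, rhs in rules.items():
            if tuple(sequence[i:i + len(rhs)]) == rhs:
                out.append(lhs)       # rule matched: emit its non-terminal
                i += len(rhs)
                break
        else:
            out.append(sequence[i])   # no rule matched: pass the symbol up
            i += 1
    return out

def recognise(terminals):
    """Apply each grammar level once, bottom-up, over the input string."""
    seq = list(terminals)
    for level in sorted(GRAMMAR):
        seq = parse_level(seq, GRAMMAR[level])
    return seq

print(recognise("uudduudd"))   # -> ['CHAR_A']
```

Because the hierarchy is strict, a single pass per level suffices, which hints at why such grammars can be mapped onto a layered network with one layer per grammar level; a stochastic variant would attach a probability to each rule and accumulate a score instead of the binary accept/reject shown here.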
© 1992 Springer Science+Business Media Dordrecht
Lucas, S.M. & Damper, R.I. (1992) Syntactic neural networks. In N. Sharkey (Ed.) Connectionist Natural Language Processing. Dordrecht: Springer. https://doi.org/10.1007/978-94-011-2624-3_4
Print ISBN: 978-94-010-5160-6
Online ISBN: 978-94-011-2624-3