Abstract
A radial basis function neural network is trained to approximate speech spectrograms. Using examples of consonant-vowel (CV) transitions, we show that such approximations are useful for extracting known discriminatory features of speech patterns. We also argue that such an approximation to a joint time-frequency representation can be viewed as a description of the dynamics of speech patterns that does not impose a uniform segmentation across frequency bands.
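As a rough illustration of the idea in the abstract, the sketch below fits a radial basis function expansion to a two-dimensional time-frequency surface. It is not the authors' procedure: the Gaussian basis, the fixed grid of centres, the shared width, the linear least-squares fit of the output weights, and the synthetic "spectrogram" are all assumptions made here for illustration, and the names (`rbf_design`, `spec`, `centres`) are hypothetical.

```python
import numpy as np

def rbf_design(points, centres, width):
    """Gaussian RBF design matrix: phi_j(x) = exp(-||x - c_j||^2 / (2 width^2))."""
    d2 = ((points[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

# Synthetic stand-in for a spectrogram (assumption): one rising formant-like ridge
# on a normalised time axis t and frequency axis f.
t = np.linspace(0.0, 1.0, 40)
f = np.linspace(0.0, 1.0, 32)
T, F = np.meshgrid(t, f, indexing="ij")
spec = np.exp(-((F - (0.3 + 0.4 * T)) ** 2) / (2.0 * 0.05 ** 2))

# Each (time, frequency) point is a network input; the energy there is the target.
points = np.column_stack([T.ravel(), F.ravel()])
target = spec.ravel()

# Centres on a coarse 10 x 10 grid with a shared width (both are assumptions).
ct, cf = np.meshgrid(np.linspace(0.0, 1.0, 10), np.linspace(0.0, 1.0, 10), indexing="ij")
centres = np.column_stack([ct.ravel(), cf.ravel()])
width = 0.1

# With centres and width fixed, the output weights follow from linear least squares.
Phi = rbf_design(points, centres, width)
weights, *_ = np.linalg.lstsq(Phi, target, rcond=None)

approx = (Phi @ weights).reshape(spec.shape)
rel_err = np.linalg.norm(approx - spec) / np.linalg.norm(spec)
print(f"relative approximation error: {rel_err:.3f}")
```

Because each basis function is localised in both time and frequency, the fitted weights describe the surface region by region rather than frame by frame, which is one way to read the abstract's remark about avoiding a uniform segmentation across frequency bands.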
Copyright information
© 1990 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Niranjan, M., Fallside, F. (1990). Speech feature extraction using neural networks. In: Almeida, L.B., Wellekens, C.J. (eds) Neural Networks. EURASIP 1990. Lecture Notes in Computer Science, vol 412. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-52255-7_40
DOI: https://doi.org/10.1007/3-540-52255-7_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-52255-3
Online ISBN: 978-3-540-46939-1
eBook Packages: Springer Book Archive