Abstract
This paper introduces a novel algorithm for learning the amplitude of non-linear activation functions (of arbitrary analytical form) in layered networks. The algorithm is based on a steepest gradient-descent technique and relies on the inductive proof of a theorem involving the concept of the expansion function of the activation associated with a given unit of the neural net. Experimental results obtained in a speaker normalization task with a mixture of Multilayer Perceptrons show a 12.64% word error rate reduction with respect to standard Back-Propagation training.
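The paper's exact update rule and the expansion-function theorem are not reproduced on this page, so the following is only a minimal NumPy sketch of the general idea the abstract describes: each unit's output is scaled by a trainable amplitude λ, and λ is updated by steepest gradient descent on the squared error alongside the ordinary weights. The toy regression data, the tanh base activation, the learning rate `eta`, and all variable names are illustrative assumptions, not the author's notation or method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression data (illustrative only)
X = rng.uniform(-1.0, 1.0, size=(64, 1))
T = np.sin(3.0 * X)

# One hidden layer whose units have trainable amplitudes lambda_j:
#   h_j = lambda_j * tanh(a_j),   y = V h + c
H = 8
W = rng.normal(scale=0.5, size=(1, H)); b = np.zeros(H)
V = rng.normal(scale=0.5, size=(H, 1)); c = np.zeros(1)
lam = np.ones(H)   # amplitudes, initialised to the conventional value 1

eta = 0.05
for epoch in range(2000):
    a = X @ W + b                # pre-activations          (N, H)
    f = np.tanh(a)               # base activation          (N, H)
    h = lam * f                  # amplitude-scaled output  (N, H)
    y = h @ V + c                # network output           (N, 1)
    err = y - T                  # dE/dy for E = 0.5 * sum(err**2)

    # Steepest-descent gradients, amplitudes included
    dV = h.T @ err
    dc = err.sum(axis=0)
    dh = err @ V.T                        # (N, H)
    dlam = (dh * f).sum(axis=0)           # dE/dlambda_j = sum_n dh_j * tanh(a_j)
    da = dh * lam * (1.0 - f ** 2)        # back through lambda_j * tanh(a_j)
    dW = X.T @ da
    db = da.sum(axis=0)

    for p, g in ((W, dW), (b, db), (V, dV), (c, dc), (lam, dlam)):
        p -= eta * g / len(X)             # in-place parameter update

y = (lam * np.tanh(X @ W + b)) @ V + c
print("final MSE:", float(np.mean((y - T) ** 2)))
```

In this sketch the amplitudes simply receive their own gradient term, which is the core point of the abstract: the activation's scale is treated as a learnable parameter rather than a fixed design choice.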