
Learning the Amplitude of Activation Functions in Layered Networks

  • Conference paper
Neural Nets WIRN VIETRI-98

Part of the book series: Perspectives in Neural Computing


Abstract

This paper introduces a novel algorithm for learning the amplitude of non-linear activation functions (of arbitrary analytical form) in layered networks. The algorithm is based on a steepest gradient-descent technique and relies on the inductive proof of a theorem involving the concept of the expansion function of the activation associated with a given unit of the neural net. Experimental results on a speaker normalization task with a mixture of Multilayer Perceptrons show a 12.64% word error rate reduction with respect to standard Back-Propagation training.
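The abstract does not give the update equations, so the following is only a minimal sketch of the general idea: per-unit amplitudes treated as ordinary trainable parameters and updated by steepest gradient descent jointly with the weights. All specifics here are illustrative assumptions, not the paper's formulation: a one-hidden-layer MLP with logistic units, squared-error loss, a per-unit amplitude lambda_j applied as y_j = lambda_j * sigmoid(net_j), and a toy XOR task.

```python
# Minimal sketch (illustrative, not the paper's exact algorithm):
# hidden activations with per-unit trainable amplitudes lambda_j,
# i.e. h_j = lambda_j * sigmoid(net_j), trained by gradient descent.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy XOR-like task (assumption, for illustration only)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([[0], [1], [1], [0]], dtype=float)

n_in, n_hid, n_out = 2, 4, 1
W1 = rng.normal(0, 0.5, (n_in, n_hid)); b1 = np.zeros(n_hid)
W2 = rng.normal(0, 0.5, (n_hid, n_out)); b2 = np.zeros(n_out)
lam = np.ones(n_hid)              # trainable amplitudes, initialised to 1

eta = 0.5
for epoch in range(5000):
    # forward pass
    net1 = X @ W1 + b1
    s = sigmoid(net1)
    h = lam * s                   # amplitude-scaled hidden activations
    net2 = h @ W2 + b2
    y = sigmoid(net2)

    # backward pass for squared-error loss E = 0.5 * sum((y - t)^2)
    delta2 = (y - t) * y * (1 - y)
    dW2 = h.T @ delta2; db2 = delta2.sum(0)

    back = delta2 @ W2.T          # error signal reaching the hidden layer
    dlam = (back * s).sum(0)      # dE/dlambda_j: amplitude gradient
    delta1 = back * lam * s * (1 - s)
    dW1 = X.T @ delta1; db1 = delta1.sum(0)

    # steepest gradient-descent updates: amplitudes learned with the weights
    W1 -= eta * dW1; b1 -= eta * db1
    W2 -= eta * dW2; b2 -= eta * db2
    lam -= eta * dlam

print("outputs:", y.ravel())
print("learned amplitudes:", lam)
```

In this sketch the only change relative to standard Back-Propagation is the extra partial derivative with respect to each lambda_j; the weight and bias updates are otherwise unchanged.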





Copyright information

© 1999 Springer-Verlag London Limited

About this paper

Cite this paper

Trentin, E. (1999). Learning the Amplitude of Activation Functions in Layered Networks. In: Marinaro, M., Tagliaferri, R. (eds) Neural Nets WIRN VIETRI-98. Perspectives in Neural Computing. Springer, London. https://doi.org/10.1007/978-1-4471-0811-5_12


  • DOI: https://doi.org/10.1007/978-1-4471-0811-5_12

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-1208-2

  • Online ISBN: 978-1-4471-0811-5

