
Learning the Amplitude of Activation Functions in Layered Networks

  • Conference paper
Neural Nets WIRN VIETRI-98

Part of the book series: Perspectives in Neural Computing


Abstract

This paper introduces a novel algorithm for learning the amplitude of non-linear activation functions (of arbitrary analytical form) in layered networks. The algorithm is based on a steepest gradient-descent technique and relies on the inductive proof of a theorem involving the concept of the expansion function of the activation associated with a given unit of the neural net. Experimental results on a speaker normalization task with a mixture of Multilayer Perceptrons show a 12.64% word error rate reduction with respect to standard Back-Propagation training.
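The abstract does not give the update equations, so the following is only a minimal sketch of the general idea: per-unit amplitudes treated as ordinary trainable parameters and updated by steepest gradient descent jointly with the weights. All specifics here are illustrative assumptions, not the paper's formulation: a one-hidden-layer MLP with logistic units, squared-error loss, a per-unit amplitude lambda_j applied as y_j = lambda_j * sigmoid(net_j), and a toy XOR task.

```python
# Minimal sketch (illustrative, not the paper's exact algorithm):
# hidden activations with per-unit trainable amplitudes lambda_j,
# i.e. h_j = lambda_j * sigmoid(net_j), trained by gradient descent.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy XOR-like task (assumption, for illustration only)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([[0], [1], [1], [0]], dtype=float)

n_in, n_hid, n_out = 2, 4, 1
W1 = rng.normal(0, 0.5, (n_in, n_hid)); b1 = np.zeros(n_hid)
W2 = rng.normal(0, 0.5, (n_hid, n_out)); b2 = np.zeros(n_out)
lam = np.ones(n_hid)              # trainable amplitudes, initialised to 1

eta = 0.5
for epoch in range(5000):
    # forward pass
    net1 = X @ W1 + b1
    s = sigmoid(net1)
    h = lam * s                   # amplitude-scaled hidden activations
    net2 = h @ W2 + b2
    y = sigmoid(net2)

    # backward pass for squared-error loss E = 0.5 * sum((y - t)^2)
    delta2 = (y - t) * y * (1 - y)
    dW2 = h.T @ delta2; db2 = delta2.sum(0)

    back = delta2 @ W2.T          # error signal reaching the hidden layer
    dlam = (back * s).sum(0)      # dE/dlambda_j: amplitude gradient
    delta1 = back * lam * s * (1 - s)
    dW1 = X.T @ delta1; db1 = delta1.sum(0)

    # steepest gradient-descent updates: amplitudes learned with the weights
    W1 -= eta * dW1; b1 -= eta * db1
    W2 -= eta * dW2; b2 -= eta * db2
    lam -= eta * dlam

print("outputs:", y.ravel())
print("learned amplitudes:", lam)
```

In this sketch the only change relative to standard Back-Propagation is the extra partial derivative with respect to each lambda_j; the weight and bias updates are otherwise unchanged.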





Copyright information

© 1999 Springer-Verlag London Limited

About this paper

Cite this paper

Trentin, E. (1999). Learning the Amplitude of Activation Functions in Layered Networks. In: Marinaro, M., Tagliaferri, R. (eds) Neural Nets WIRN VIETRI-98. Perspectives in Neural Computing. Springer, London. https://doi.org/10.1007/978-1-4471-0811-5_12


  • DOI: https://doi.org/10.1007/978-1-4471-0811-5_12

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-1208-2

  • Online ISBN: 978-1-4471-0811-5

