Abstract:
Equilibrium states of large layered neural networks with differentiable activation function and a single, linear output unit are investigated using the replica formalism. The quenched free energy of a student network with a very large number of hidden units learning a rule of perfectly matching complexity is calculated analytically. The system undergoes a first order phase transition from unspecialized to specialized student configurations at a critical size of the training set. Computer simulations of learning by stochastic gradient descent from a fixed training set demonstrate that the equilibrium results describe quantitatively the plateau states which occur in practical training procedures at sufficiently small but finite learning rates.
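The setup described above can be illustrated with a minimal numerical sketch: a student soft-committee machine (sum of hidden-unit activations feeding a single linear output) trained by stochastic gradient descent on a fixed training set generated by a teacher of perfectly matching complexity. This is not the paper's calculation, only an illustrative toy; the network sizes, the tanh activation, the learning rate, and all variable names are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

N, K, P = 50, 3, 500   # input dimension, hidden units, training-set size (illustrative)
eta = 0.05             # small but finite learning rate

g = np.tanh            # a differentiable activation (illustrative choice)
def gprime(h):
    return 1.0 - np.tanh(h) ** 2

def output(W, x):
    """Soft-committee output: hidden activations summed by a linear output unit."""
    return g(W @ x).sum()

# Teacher of matching complexity (same K) defines the rule to be learned.
W_teacher = rng.standard_normal((K, N)) / np.sqrt(N)
X = rng.standard_normal((P, N))
y = np.array([output(W_teacher, x) for x in X])

# Student trained by stochastic gradient descent on the fixed training set.
W = rng.standard_normal((K, N)) / np.sqrt(N)
for epoch in range(200):
    for mu in rng.permutation(P):
        x, t = X[mu], y[mu]
        h = W @ x
        delta = output(W, x) - t                       # output error on pattern mu
        W -= eta * delta * gprime(h)[:, None] * x      # gradient of squared error

E = 0.5 * np.mean([(output(W, x) - t) ** 2 for x, t in zip(X, y)])
print(f"mean training error per pattern: {E:.4f}")
```

In such runs the training error typically stagnates on a plateau while the student's hidden units remain unspecialized (each weakly correlated with all teacher units), before specialization sets in; the paper's point is that these plateau states are described quantitatively by the equilibrium (replica) calculation.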
Received 16 December 1998
Ahr, M., Biehl, M. & Urbanczik, R. Statistical physics and practical training of soft-committee machines. Eur. Phys. J. B 10, 583–588 (1999). https://doi.org/10.1007/s100510050889