This paper presents an investigation of the balance between designing structure into a machine composed of neural components, and training to produce an architecture. The training of recurrent neural networks scales poorly, largely because the number of connections grows quadratically with the number of nodes.
As the problem size increases, so does the number of hidden nodes required. However, as the network grows larger, an increasing proportion of its connections will be redundant in the final solution. The example investigated here is the use of neural networks to emulate basic digital circuits. If the conventional electronic solutions were built out of neural components, most of the connections would be absent. Therefore, pruning neural networks before conventional training will reduce both the training complexity and the training time.
First, the problem of training recurrent networks to behave as flip-flops is considered; this is then extended to the larger problem of binary counters. Training times and reliability are compared for each problem, using fully recurrent networks and networks in which specific connections are severed before training begins. The training characteristics are compared across networks differing in the amount of structure imposed prior to training.
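The idea of severing specific connections before training can be sketched as a fixed binary mask applied to a recurrent weight matrix, so that pruned links start at zero and can never re-grow during weight updates. This is a minimal illustrative sketch, not the paper's actual training procedure; the network size, the choice of which connections to sever, and the stand-in gradient are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 4  # hidden nodes (illustrative size, not from the paper)
# Fully recurrent weight matrix: n*n trainable connections.
W = rng.normal(scale=0.5, size=(n, n))

# Structural prior: sever chosen connections before training by
# fixing a binary mask (1 = trainable link, 0 = severed link).
mask = np.ones((n, n))
mask[0, 2] = mask[2, 0] = mask[1, 3] = 0.0  # hypothetical cuts
W *= mask  # severed links begin at zero


def step(h, x, W):
    """One recurrent update using the pruned weight matrix."""
    return np.tanh(W @ h + x)


# During training, gradient updates are masked as well, so the
# severed connections stay at exactly zero.
grad = rng.normal(size=(n, n))  # stand-in for a real gradient
W -= 0.1 * (grad * mask)

h = step(np.zeros(n), np.ones(n), W)
```
Only the unmasked entries of `W` are ever updated, so the effective number of trainable parameters, and hence the training complexity, is reduced in proportion to the number of severed links.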