The roles of top-down signals in visual processing have been studied intensively in experiments over the last few years and have also been modeled theoretically. In particular, predictive coding, in which feedback from higher cortical areas carries expectations of lower-level activity, has been shown to explain the emergence of extra-classical receptive field effects [1].

Since top-down predictions cannot be independent of the bottom-up input, the interpretation of a visual scene must be an iterative process in which the initial activation pattern relaxes to a solution matching expectation with sensory experience. However, in models such as that of Rao and Ballard, where top-down effects propagate across all layers of the visual hierarchy, relaxation is too slow compared to the time scale of visual processing. Our starting point is the mathematical observation that, in predictive coding, relaxation results in each higher layer essentially performing principal component analysis of the activity in the preceding lower layer. We show how this analysis can be done without propagating top-down effects through the entire visual hierarchy. In our model, not only the top-down connectivity but also the resulting effective feedback is confined to proximal layers, yielding fast relaxation. This suggests a way for the visual system to reconcile the need for fast processing with the integration of top-down cues.
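The relaxation described above can be illustrated for a single pair of adjacent layers. The following is a minimal sketch under stated assumptions, not the model proposed here: it assumes a linear generative model with fixed weights `U` whose columns are orthonormal (standing in for learned principal components), and the names `predict`, `relax`, `rate`, and `steps` are purely illustrative. Under these assumptions, gradient relaxation on the squared prediction error settles on the projection of the input onto the columns of `U`, which is the sense in which the higher layer reports principal-component coefficients of the lower-layer activity.

```python
import math


def predict(U, r):
    """Top-down prediction of lower-layer activity: U r."""
    return [sum(U[i][k] * r[k] for k in range(len(r))) for i in range(len(U))]


def relax(x, U, steps=200, rate=0.1):
    """Relax the representation r by gradient descent on |x - U r|^2 / 2.

    Assumption: U is fixed during relaxation (no learning in this sketch).
    Returns the settled representation r and the remaining prediction error e.
    """
    n_causes = len(U[0])
    r = [0.0] * n_causes
    for _ in range(steps):
        # Prediction error: bottom-up input minus top-down prediction.
        e = [xi - pi for xi, pi in zip(x, predict(U, r))]
        # Each representation unit integrates the error fed forward through U^T.
        r = [r[k] + rate * sum(U[i][k] * e[i] for i in range(len(x)))
             for k in range(n_causes)]
    e = [xi - pi for xi, pi in zip(x, predict(U, r))]
    return r, e


# Hypothetical example: 3 input units, 2 higher-layer units, orthonormal columns.
s = 1.0 / math.sqrt(2.0)
U = [[1.0, 0.0],
     [0.0, s],
     [0.0, s]]
x = [2.0, 1.0, 3.0]

r, e = relax(x, U)

# Direct projection U^T x, for comparison with the relaxed representation.
projection = [sum(U[i][k] * x[i] for i in range(len(x))) for k in range(len(U[0]))]

print("relaxed r:     ", r)
print("projection U^T x:", projection)
print("residual error: ", e)
```

In this single-pair setting the relaxation involves only the two adjacent layers. The sketch does not attempt to reproduce the multi-layer case, where the speed of relaxation depends on how far top-down effects propagate.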

We also make explicit that, in a biologically plausible implementation, each layer must code both for the stimulus representation and for the prediction error. This sheds new light on the interpretation of activation patterns observed in V4 that have seemed impossible to reconcile with predictive coding [2].