Abstract
Signal processing in the cerebral cortex is thought to involve a common multi-purpose algorithm embodied in a canonical cortical micro-circuit that is replicated many times over both within and across cortical regions. Operation of this algorithm produces widely distributed but coherent and relevant patterns of activity. The theory of Coherent Infomax provides a formal specification of the objectives of such an algorithm. It also formally derives specifications for both the short-term processing dynamics and the learning rules whereby the connection strengths between units in the network can be adapted to the environment in which the system finds itself. A central assumption of the theory is that the local processors can combine reliable signal coding with flexible use of those codes because they have two classes of synaptic connection: driving connections, which specify the information content of the neural signals, and contextual connections, which modulate that signal processing. Here, we make the biological relevance of this theory more explicit by putting more emphasis upon the contextual guidance of ongoing processing, by showing that Coherent Infomax is consistent with a particular Bayesian interpretation for the contextual guidance of learning and processing, by explicitly specifying rules for on-line learning, and by suggesting approximations by which the learning rules can be made computationally feasible within systems composed of very many local processors.
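The abstract does not state the objective itself, so the following is only an illustrative sketch. It assumes a single local processor with a discrete output X, a driving (receptive-field) input R, and a contextual input C, and it uses the standard identity that splits the output entropy into four parts: H(X) = I(X;R;C) + I(X;R|C) + I(X;C|R) + H(X|R,C). The function names, the dictionary keys, and the weights `phi` in `weighted_objective` are hypothetical choices made for this example, not values or notation taken from the paper.

```python
import numpy as np


def entropy(p):
    """Shannon entropy in bits of a (possibly multi-dimensional) pmf."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]  # treat 0 * log(0) as 0
    return float(-(p * np.log2(p)).sum())


def entropy_components(p_xrc):
    """Split H(X) into four parts for a joint pmf indexed as p[x, r, c]."""
    p = np.asarray(p_xrc, dtype=float)
    p = p / p.sum()  # normalise defensively

    # Marginal and joint entropies.
    h_x = entropy(p.sum(axis=(1, 2)))
    h_r = entropy(p.sum(axis=(0, 2)))
    h_c = entropy(p.sum(axis=(0, 1)))
    h_xr = entropy(p.sum(axis=2))
    h_xc = entropy(p.sum(axis=1))
    h_rc = entropy(p.sum(axis=0))
    h_xrc = entropy(p)

    i_xr = h_x + h_r - h_xr                      # I(X;R)
    i_xr_given_c = h_xc + h_rc - h_xrc - h_c     # I(X;R|C)
    i_xc_given_r = h_xr + h_rc - h_xrc - h_r     # I(X;C|R)
    h_x_given_rc = h_xrc - h_rc                  # H(X|R,C)
    i_xrc = i_xr - i_xr_given_c                  # three-way term I(X;R;C)

    return {
        "H(X)": h_x,
        "I(X;R;C)": i_xrc,
        "I(X;R|C)": i_xr_given_c,
        "I(X;C|R)": i_xc_given_r,
        "H(X|R,C)": h_x_given_rc,
    }


def weighted_objective(parts, phi):
    """Weighted sum of the four components; `phi` holds hypothetical weights."""
    keys = ["I(X;R;C)", "I(X;R|C)", "I(X;C|R)", "H(X|R,C)"]
    return sum(w * parts[k] for w, k in zip(phi, keys))


# Toy case: X, R and C are the same fair bit, so all of the output
# information is shared between the driving and contextual inputs.
p_toy = np.zeros((2, 2, 2))
p_toy[0, 0, 0] = 0.5
p_toy[1, 1, 1] = 0.5
parts = entropy_components(p_toy)
total = sum(v for k, v in parts.items() if k != "H(X)")
```

In this toy case the four components sum to H(X), and the only non-zero component is the three-way term, which is the part of the output information that the driving input and the context hold in common. A weighted sum such as `weighted_objective(parts, phi)` is one way to express a preference among the four components; the particular weights are left to the reader.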
Cite this article
Kay, J.W., Phillips, W.A. Coherent Infomax as a Computational Goal for Neural Systems. Bull Math Biol 73, 344–372 (2011). https://doi.org/10.1007/s11538-010-9564-x
DOI: https://doi.org/10.1007/s11538-010-9564-x