Abstract
This article is an overview of a programme of research based on the conjecture that all kinds of computing and formal reasoning may usefully be understood as information compression by pattern matching, unification and metrics-guided search.
The research aims to develop this idea into a theory of computing to integrate and simplify diverse concepts in the field. The research also aims to develop a ‘new generation’ computing system, based on the theory, to integrate and simplify diverse kinds of computing and to achieve more flexibility and ‘intelligence’ than conventional computers. Software simulations of the proposed new system provide a concrete expression of the developing theory and a test-bed for the ideas.
The background to the research is briefly reviewed including evidence that information compression is a significant element in biological information processing systems.
Concepts of information and redundancy are described as a basis for describing how information compression may be achieved by the comparison or matching of patterns, the merging or unification of patterns which are the same, together with metrics-guided search (e.g., ‘hill climbing’, ‘beam search’) to maximise compression for a given computational effort.
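The combination of matching, unification and greedy (‘hill climbing’) search described above can be illustrated with a toy sketch. This is not the SP system itself: the example text, the one-symbol codes and the crude saving metric are all invented for illustration.

```python
# Toy illustration (not the SP system): compress a sequence by
# repeatedly finding a repeated pattern, unifying its occurrences
# into one stored copy, and replacing occurrences with a short code.
# Search is greedy ("hill climbing"): at each step take the pattern
# whose unification saves the most symbols.

def best_pattern(text, min_len=2):
    """Return the substring whose unification saves most, or None.
    (Occurrence counts here include overlaps, so the metric is crude.)"""
    best, best_saving = None, 0
    for length in range(min_len, len(text) // 2 + 1):
        seen = {}
        for i in range(len(text) - length + 1):
            sub = text[i:i + length]
            seen[sub] = seen.get(sub, 0) + 1
        for sub, count in seen.items():
            if count < 2:
                continue
            # saving = symbols removed, minus the cost of one stored
            # copy plus a 1-symbol code per occurrence
            saving = count * len(sub) - (len(sub) + count)
            if saving > best_saving:
                best, best_saving = sub, saving
    return best

def compress(text):
    """Greedy unification loop; returns (residue, dictionary)."""
    dictionary = {}
    code = 0
    while True:
        pat = best_pattern(text)
        if pat is None:
            return text, dictionary
        symbol = chr(0xE000 + code)   # private-use char as a short code
        code += 1
        dictionary[symbol] = pat      # the single, unified copy
        text = text.replace(pat, symbol)

residue, rules = compress("the cat sat on the mat on the hat")
print(residue, rules)
```

On this input the greedy search unifies the repeated pattern “at on the ”, leaving a shorter residue plus a one-entry dictionary; the original is recoverable by expanding the code, which is what makes this compression rather than mere deletion.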
The main elements of the SP theory and of the proposed SP system are described with a summary of developments to date.
Some of the kinds of computing which may be interpreted as information compression are briefly reviewed. These include: the ‘low level’ workings of conventional computers; information retrieval, pattern recognition and de-referencing of identifiers; unsupervised inductive learning (grammatical inference, data mining, automatic organisation of software and of knowledge bases); the execution of mathematical or computing functions; deductive and probabilistic inference; parsing and natural language processing; planning and problem solving.
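The link between information retrieval, pattern recognition and de-referencing of identifiers can be made concrete with a small hypothetical sketch (the store, codes and word-overlap score are invented for illustration, not taken from the SP system): each record is stored once under a short code, recognition finds the stored pattern that best matches a partial input, and de-referencing the code recovers the full pattern.

```python
# Hypothetical sketch: retrieval, recognition and de-referencing as
# aspects of compression. Each pattern is stored once under a short
# code; a code used elsewhere is a compressed stand-in for the record.

STORE = {
    "P1": "john smith 12 high street newtown",
    "P2": "jane jones 5 mill lane oldbury",
}

def recognise(fragment):
    """Best-match retrieval: return the code of the stored pattern
    sharing the most words with a (possibly partial) input."""
    words = set(fragment.split())
    return max(STORE, key=lambda code: len(words & set(STORE[code].split())))

def dereference(code):
    """Expanding a code recovers the full pattern: decompression."""
    return STORE[code]

code = recognise("smith high street")   # partial input
full = dereference(code)                # the unified stored copy
print(code, "->", full)
```

Recognition here is lossy pattern matching (a fragment suffices), while de-referencing is exact expansion; together they mirror the encode/decode halves of a compression scheme.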
Areas of uncertainty where further work is needed are indicated at appropriate points throughout the article.
References
Aamodt, A. and Plaza, E., “Case-Based Reasoning: Foundational Issues, Methodological Variations, and System Approaches,” AI Communications, 7, pp. 39–59, 1994.
Atick, J. J. and Redlich, A. N., “Towards a Theory of Early Visual Processing,” Neural Computation, 2, pp. 308–320, 1990.
Attneave, F., “Informational Aspects of Visual Perception,” Psychological Review, 61, pp. 183–193, 1954.
Barlow, H. B., “Possible Principles Underlying the Transformations of Sensory Messages,” in Sensory Communication (W. A. Rosenblith, ed.), Cambridge, Mass.: MIT Press, pp. 217–234, 1961.
Barlow, H. B., “Trigger Features, Adaptation and Economy of Impulses,” in Information Processing in the Nervous System (K. N. Leibovic, ed.), New York: Springer, pp. 209–230, 1969.
Barlow, H. B., “Single Units and Sensation: A Neuron Doctrine for Perceptual Psychology,” Perception, 1, pp. 371–394, 1972.
Barlow, H. B. and Földiák, P., “Adaptation and Decorrelation in the Cortex,” in The Computing Neuron (R. M. Durbin, C. Miall, and G. J. Mitchison, eds.), Chapter 4, Wokingham: Addison-Wesley, pp. 54–72, 1989.
Barlow, H. B., Kaushal, T. P., and Mitchison, G. J., “Finding Minimum Entropy Codes,” Neural Computation, 1, pp. 412–423, 1989.
Becker, K.-H. and Dörfler, M., Dynamical Systems and Fractals, Cambridge: Cambridge University Press, 1989.
Chaitin, G. J., Algorithmic Information Theory, Cambridge: Cambridge University Press, 1987.
Cheeseman, P., “On Finding the Most Probable Model,” in Computational Models of Scientific Discovery and Theory Formation (J. Shrager and P. Langley, eds.), San Mateo, Ca.: Morgan Kaufmann, pp. 73–95, 1990.
Collins, A. M. and Quillian, M. R., “Experiments on Semantic Memory and Language Comprehension,” in Cognition in Learning and Memory (L. W. Gregg, ed.), New York: Wiley, pp. 117–147, 1972.
Cook, C. M. and Rosenfeld, A., “Some Experiments in Grammatical Inference,” in Computer Oriented Learning Processes (J. C. Simon, ed.), Leyden: Noordhoff, pp. 157–174, 1976.
Cottrell, G. W., Munro, P., and Zipser, D., “Image Compression by Back Propagation: An Example of Extensional Programming,” in Models of Cognition: A Review of Cognitive Science (N. E. Sharkey, ed.), pp. 209–238, 1989.
Enderle, G., Kansy, K., and Pfaff, G., Computer Graphics Programming, Berlin: Springer-Verlag, 1987.
Földiák, P., “Forming Sparse Representations by Local Anti-Hebbian Learning,” Biological Cybernetics, 64, pp. 165–170, 1990.
Forsyth, R. S., “Ockham’s Razor as a Gardening Tool: Simplifying Discrimination Trees by Entropy Min-Max,” in Research and Development in Expert Systems, X (M. A. Bramer and R. W. Milne, eds.), Cambridge: Cambridge University Press, pp. 183–195, 1992.
Fries, C. C., The Structure of English, New York: Harcourt, Brace & World, 1952.
Gammerman, A., “The Representation and Manipulation of the Algorithmic Probability Measure for Problem Solving,” Annals of Mathematics and Artificial Intelligence, 4, pp. 281–300, 1991.
Gammerman, A., “Geometric Analogy Problems by Minimum Length Encoding,” 4th Conference of the International Federation of Classification Societies (IFCS-93), Paris, August–September 1993.
Gazdar, G. and Mellish, C., Natural Language Processing in Prolog, Wokingham: Addison-Wesley, 1989.
Harris, Z. S., “Distributional Structure,” Linguistics Today, 10, pp. 146–162, 1954.
Held, G. and Marshall, T. R., Data Compression: Techniques and Applications, Hardware and Software Considerations, second edition, Chichester: Wiley, 1987.
Hinton, G. E. and Sejnowski, T. J., “Learning and Relearning in Boltzmann Machines,” in Parallel Distributed Processing, Vol. 1 (D. E. Rumelhart and J. L. McClelland, eds.), Cambridge, Mass.: MIT Press, pp. 282–317, 1986.
Hopfield, J. J., “Neural Networks and Physical Systems with Emergent Collective Computational Abilities,” Proceedings of the National Academy of Sciences, USA, 79, pp. 2554–2558, 1982.
Kolmogorov, A. N., “Three Approaches to the Quantitative Definition of Information,” Problems of Information Transmission, 1, 1, pp. 1–7, 1965.
Kumar, V., “Algorithms for Constraint-Satisfaction Problems,” AI Magazine, 13, 1, pp. 32–44, 1992.
Li, M. and Vitanyi, P. M. B., “Kolmogorov Complexity and Its Applications,” in Handbook of Theoretical Computer Science (J. van Leeuwen, ed.), Chapter 4, Amsterdam: Elsevier, pp. 188–254, 1990.
Li, M. and Vitanyi, P. M. B., “Inductive Reasoning and Kolmogorov Complexity,” Journal of Computer and System Sciences, 44, pp. 343–384, 1992.
Linsker, R., “Self-Organization in a Perceptual Network,” IEEE Computer, 21, pp. 105–117, 1988.
Mahowald, M. A. and Mead, C., “The Silicon Retina,” Scientific American, 264, 5, pp. 40–47, 1991.
Mandrioli, D. and Ghezzi, C., Theoretical Foundations of Computer Science, New York: Wiley, 1987.
Muggleton, S., “Inductive Logic Programming,” New Generation Computing, 8, 4, pp. 295–318, 1991.
Newell, A., Unified Theories of Cognition, Cambridge, Mass.: Harvard University Press, 1990.
Newell, A., Shaw, J. C. and Simon, H., “Elements of a Theory of Human Problem Solving,” Psychological Review, 65, pp. 151–166, 1958.
Oja, E., “A Simplified Neuron Model as a Principal Component Analyser,” Journal of Mathematical Biology, 15, pp. 267–273, 1982.
Oldfield, R. C., “Memory Mechanisms and the Theory of Schemata,” British Journal of Psychology, 45, pp. 14–23, 1954.
Pednault, E. P. D., “Minimal Length Encoding and Inductive Inference,” in Knowledge Discovery in Databases (G. Piatetsky-Shapiro and W. J. Frawley, eds.), Cambridge, Mass.: MIT Press, pp. 71–92, 1991.
Phillips, W. A., Hay, I. M., and Smith, L. S., “Lexicality and Pronunciation in a Simulated Neural Net,” British Journal of Mathematical and Statistical Psychology, 46, pp. 193–205, 1993.
Redlich, A. N., “Redundancy Reduction as a Strategy for Unsupervised Learning,” Neural Computation, 5, pp. 289–304, 1993.
Rissanen, J., “Modelling by the Shortest Data Description,” Automatica-J. IFAC, 14, pp. 465–471, 1978.
Rissanen, J., “Stochastic Complexity,” Journal of the Royal Statistical Society, B 49, 3, pp. 223–239 and pp. 252–265, 1987.
Sanger, T. D., “Optimal Unsupervised Learning in a Single-Layer Linear Feed-Forward Network,” Neural Networks, 2, pp. 459–473, 1989.
Shannon, C. E. and Weaver, W., The Mathematical Theory of Communication, Urbana: University of Illinois Press, 1949.
Solomonoff, R. J., “A Formal Theory of Inductive Inference. Parts I and II,” Information and Control, 7, pp. 1–22 and pp. 224–254, 1964.
Solomonoff, R. J., “The Application of Algorithmic Probability to Problems in Artificial Intelligence,” in Uncertainty in Artificial Intelligence (L. N. Kanal and J. F. Lemmer, eds.), Elsevier Science, pp. 473–491, 1986.
Stanfill, C. and Waltz, D., “Toward Memory-Based Reasoning,” Communications of the ACM, 29, 12, pp. 1213–1228, 1986.
Storer, J. A., Data Compression: Methods and Theory, Rockville, Maryland: Computer Science Press, 1988.
Southcott, C. B., Boyd, I., Coleman, A. E. and Hammett, P. G., “Low Bit Rate Speech Coding for Practical Applications,” in Speech and Language Processing (C. Wheddon and R. Linggard, eds.), London: Chapman & Hall, 1990.
Stephen, G. A. and Mather, P., “Sweeping away the Problems That Dog the Industry?” AI Communications, 6, 3/4, pp. 213–218, 1993.
Sudkamp, T. A., Languages and Machines, an Introduction to the Theory of Computer Science, Reading, Mass.: Addison-Wesley, 1988.
Uspensky, V. A., “Kolmogorov and Mathematical Logic,” Journal of Symbolic Logic, 57, 2, pp. 385–412, 1992.
Von Békésy, G., Sensory Inhibition, Princeton, NJ: Princeton University Press, 1967.
Wallace, C. S. and Boulton, D. M., “An Information Measure for Classification,” Computer Journal, 11, 2, pp. 185–195, 1968.
Wallace, C. S. and Freeman, P. R., “Estimation and Inference by Compact Coding,” Journal of the Royal Statistical Society, B 49, 3, pp. 240–252, 1987.
Watanabe, S., “Pattern Recognition as Information Compression,” in Frontiers of Pattern Recognition (S. Watanabe, ed.), New York: Academic Press, 1972.
Watanabe, S., Pattern Recognition: Human and Mechanical, New York: Wiley, 1985.
Winston, P. H., Artificial Intelligence, third edition, Reading, Mass.: Addison-Wesley, 1992.
Wolff, J. G., “Language Acquisition, Data Compression and Generalisation,” Language & Communication, 2, pp. 57–89, 1982. (Reproduced in Ref. 63, Chapter 3.)
Wolff, J. G., “Learning Syntax and Meanings through Optimization and Distributional Analysis,” in Categories and Processes in Language Acquisition (Y. Levy, I. M. Schlesinger, and M. D. S. Braine, eds.), Hillsdale, N. J.: Lawrence Erlbaum, pp. 179–215, 1988. (Reproduced in Ref. 63, Chapter 2.)
Wolff, J. G., “The Management of Risk in System Development: ‘Project SP’ and the ‘New Spiral Model’,” Software Engineering Journal, 4, 3, pp. 134–142, 1989.
Wolff, J. G., “Simplicity and Power: Some Unifying Ideas in Computing,” Computer Journal, 33, 6, pp. 518–534, 1990. (Reproduced in Ref. 63, Chapter 4.)
Wolff, J. G., Towards a Theory of Cognition and Computing, Chichester: Ellis Horwood, 1991.
Wolff, J. G., “On the Integration of Learning, Logical Deduction and Probabilistic Inductive Inference,” Proceedings of the First International Workshop on Inductive Logic Programming, Viana de Castelo, Portugal, pp. 177–191, March 1991.
Wolff, J. G., “Computing, Cognition and Information Compression,” AI Communications, 6, 2, pp. 107–127, 1993.
Wolff, J. G., “Towards a New Concept of Software,” Software Engineering Journal, 9, 1, pp. 27–38, 1994.
Wolff, J. G., “A Scaleable Technique for Best-Match Retrieval of Sequential Information Using Metrics-Guided Search,” Journal of Information Science, 20, 1, pp. 16–28, 1994.
Wolff, J. G., “Computing as Compression: SP20,” New Generation Computing, 13, 2, pp. 215–241, 1995.
Wolff, J. G., “Computing and Information Compression: A Reply,” AI Communications, 7, 3/4, pp. 203–219, 1994.
Wolff, J. G., “An Alternative Scaleable Technique for Best-Match Retrieval of Sequential Information Using Metrics-Guided Search,” in preparation.
Wolff, J. G. and Chipperfield, A. J., “Unifying Computing: Inductive Learning and Logic,” in Research and Development in Expert Systems, VII (T. R. Addis and R. M. Muir, eds.), (Proceedings of Expert Systems ’90, Reading, England, September 1990), pp. 263–276, 1990.
Zipf, G. K., Human Behaviour and the Principle of Least Effort, Cambridge, Mass.: Addison-Wesley, 1949.
Wolff, J.G. Computing as compression: An overview of the SP theory and system. New Gener Comput 13, 187–214 (1995). https://doi.org/10.1007/BF03038313