Abstract
A central question in the empirical sciences is: given a body of data, how do we best make predictions? There are subtle differences between current approaches, which include Minimum Message Length (MML) and Solomonoff's theory of induction [24].
The nature of hypothesis spaces is explored, and we observe a correlation between the complexity of a function and the frequency with which it is represented. There is not a single best hypothesis, as suggested by Occam's razor (which says to prefer the simplest), but a set of functionally equivalent hypotheses. One set of hypotheses is preferred over another because it is larger, giving the impression that simpler functions generalize better. The probabilistic weight assigned to a function is given by the relative size of its equivalence class, that is, the number of hypotheses that compute it. We justify Occam's razor by a counting argument over the hypothesis space.
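The counting argument can be made concrete with a small enumeration. Below is a minimal sketch in Python, assuming a toy hypothesis space of Boolean expression trees over AND, OR, and NOT with two inputs x and y; the grammar and depth bound are illustrative assumptions, not the chapter's formalism. Every program up to a fixed depth is enumerated, programs are grouped into equivalence classes by the truth table they compute, and each function is weighted by the relative size of its class.

```python
# Minimal sketch of the counting argument over a toy hypothesis space:
# enumerate every Boolean expression tree up to a fixed depth, group the
# programs by the function (truth table) they compute, and weight each
# function by the relative size of its equivalence class.
from collections import Counter
from itertools import product

ASSIGNMENTS = list(product([0, 1], repeat=2))  # all inputs (x, y)

def trees(depth):
    """Truth tables of all expression trees up to `depth`, one entry
    per syntactically distinct program (duplicates are intentional)."""
    leaves = [
        tuple(x for x, _ in ASSIGNMENTS),  # the program "x"
        tuple(y for _, y in ASSIGNMENTS),  # the program "y"
        (0, 0, 0, 0),                      # the constant program "0"
        (1, 1, 1, 1),                      # the constant program "1"
    ]
    if depth == 0:
        return leaves
    smaller = trees(depth - 1)             # all programs of depth <= d-1
    out = list(leaves)
    for a in smaller:
        out.append(tuple(1 - v for v in a))                 # NOT a
        for b in smaller:
            out.append(tuple(u & v for u, v in zip(a, b)))  # a AND b
            out.append(tuple(u | v for u, v in zip(a, b)))  # a OR b
    return out

programs = trees(2)        # 3,244 programs at depth <= 2
counts = Counter(programs) # equivalence class size per function
total = len(programs)
for table, size in counts.most_common(5):
    print(f"function {table}: {size:5d} programs, weight {size / total:.3f}")
```

In this toy space the constant functions occupy the largest equivalence classes, the single-variable projections come next, and a function such as XOR does not appear at all until depth 3: the complexity–frequency correlation described above.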
Occam's razor contrasts with the No Free Lunch theorems, which state that, averaged over all possible problems, no machine learning algorithm generalizes better than any other. The No Free Lunch theorems assume a distribution over functions, whereas Occam's razor assumes a distribution over programs.
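Continuing the hypothetical enumeration above (it reuses ASSIGNMENTS, counts, and total from the previous sketch), the contrast can be stated numerically: No Free Lunch implicitly places a uniform prior over the 16 Boolean functions of two inputs, whereas drawing programs uniformly induces a sharply non-uniform prior over the functions they compute.

```python
# Contrast between the two assumptions, using the toy enumeration above:
# NFL averages over functions as if each were equally likely, while a
# uniform choice of program induces a non-uniform distribution over
# functions (and weight zero for functions no small program computes).
n_functions = 2 ** len(ASSIGNMENTS)   # 16 Boolean functions of two inputs
uniform = 1.0 / n_functions           # the prior NFL implicitly assumes
for table, size in counts.most_common(3):
    print(f"{table}: induced weight {size / total:.3f} vs uniform {uniform:.3f}")
```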
References
Auger, A., Teytaud, O.: Continuous lunches are free plus the design of optimal optimization algorithms. Algorithmica 57(1), 121–146 (2010)
Cover, T.M., Thomas, J.A.: Elements of information theory. Wiley-Interscience, New York (1991)
Domingos, P.: The role of Occam's razor in knowledge discovery. Data Mining and Knowledge Discovery 3(4), 409–425 (1999)
Dowe, D.L.: MML, hybrid Bayesian network graphical models, statistical consistency, invariance and uniqueness. In: Handbook of the Philosophy of Science (HPS). Philosophy of Statistics, vol. 7, pp. 901–982 (2011)
Duda, R.O., Hart, P.E., Stork, D.G.: Pattern Classification, 2nd edn. Wiley-Interscience (2000)
Holte, R.C.: Very simple classification rules perform well on most commonly used datasets. Machine Learning 11, 63–91 (1993)
Hutter, M.: A complete theory of everything (will be subjective). Algorithms 3(7), 360–374 (2010)
Hutter, M.: Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability. Springer, Berlin (2004), http://www.idsia.ch/~marcus/ai/uaibook.htm
Kearns, M.J., Vazirani, U.V.: An introduction to computational learning theory. MIT Press, Cambridge (1994)
Koza, J.R.: Genetic Programming II: Automatic Discovery of Reusable Programs. The MIT Press, Cambridge (1994)
Langdon, W.B.: Scaling of program functionality. Genetic Programming and Evolvable Machines 10(1), 5–36 (2009)
Langdon, W.B.: Scaling of program fitness spaces. Evolutionary Computation 7(4), 399–428 (1999)
Li, M., Vitányi, P.: An introduction to Kolmogorov complexity and its applications, 2nd edn. Springer-Verlag New York, Inc., Secaucus (1997)
Mitchell, T.M.: Machine Learning. McGraw-Hill, New York (1997)
Murphy, P.M., Pazzani, M.J.: Exploring the decision forest: An empirical investigation of Occam's razor in decision tree induction. Journal of Artificial Intelligence Research 1, 257–275 (1994)
Needham, S.L., Dowe, D.L.: Message length as an effective Ockham's razor in decision tree induction. In: Proc. 8th International Workshop on Artificial Intelligence and Statistics (AI+STATS 2001), Key West, Florida, U.S.A., pp. 253–260 (2001)
Poli, R., Graff, M., McPhee, N.F.: Free lunches for function and program induction. In: Proceedings of the Tenth ACM SIGEVO Workshop on Foundations of Genetic Algorithms (FOGA 2009), Orlando, Florida, USA, January 9-11, pp. 183–194. ACM (2009)
Rogers, H.: Theory of recursive functions and effective computability. McGraw-Hill series in higher mathematics. MIT Press (1987)
Russell, S.J., Norvig, P.: Artificial Intelligence: A Modern Approach. Pearson Education (2003)
Schaffer, C.: A conservation law for generalization performance. In: Proceedings of the Eleventh International Conference on Machine Learning, pp. 259–265. Morgan Kaufmann (1994)
Solomonoff, R.: Machine learning - past and future. In: The Dartmouth Artificial Intelligence Conference, AI@50, pp. 257–275. Dartmouth, N.H. (2006)
Solomonoff, R.J.: A formal theory of inductive inference. Part I. Information and Control 7(1), 1–22 (1964)
Solomonoff, R.J.: A formal theory of inductive inference. Part II. Information and Control 7(2), 224–254 (1964)
Wallace, C.S., Dowe, D.L.: Minimum message length and Kolmogorov complexity. Computer Journal 42, 270–283 (1999)
Webb, G.I.: Generality is more significant than complexity: Toward an alternative to Occam's razor. In: Australian Joint Conference on Artificial Intelligence (AJCAI) (1994)
Wolpert, D.H., Macready, W.G.: No free lunch theorems for search. Technical Report SFI-TR-95-02-010, Santa Fe Institute, Santa Fe, NM (1995)
Wolpert, D.H., Macready, W.G.: No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation 1(1), 67–82 (1997)
Woodward, J.R.: Complexity and cartesian genetic programming. In: Collet, P., Tomassini, M., Ebner, M., Gustafson, S., Ekárt, A. (eds.) EuroGP 2006. LNCS, vol. 3905, pp. 260–269. Springer, Heidelberg (2006)
Woodward, J.R.: Invariance of function complexity under primitive recursive functions. In: Collet, P., Tomassini, M., Ebner, M., Gustafson, S., Ekárt, A. (eds.) EuroGP 2006. LNCS, vol. 3905, pp. 310–319. Springer, Heidelberg (2006)
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Woodward, J., Swan, J. (2013). A Syntactic Approach to Prediction. In: Dowe, D.L. (eds) Algorithmic Probability and Friends. Bayesian Prediction and Artificial Intelligence. Lecture Notes in Computer Science, vol 7070. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-44958-1_34
DOI: https://doi.org/10.1007/978-3-642-44958-1_34
Print ISBN: 978-3-642-44957-4
Online ISBN: 978-3-642-44958-1