Do Machine-Learning Machines Learn?

  • Conference paper

Part of the book series: Studies in Applied Philosophy, Epistemology and Rational Ethics (SAPERE, volume 44)

Abstract

We answer the present paper’s title in the negative. We begin by introducing and characterizing “real learning” (\(\mathcal {RL}\)) in the formal sciences, a phenomenon that has been firmly in place in homes and schools since at least Euclid. The defense of our negative answer pivots on an integration of reductio and proof by cases, and constitutes a general method for showing that any contemporary form of machine learning (ML) isn’t real learning. Along the way, we canvass the many different conceptions of “learning” in not only AI but also psychology and its allied disciplines; none of these conceptions (with one exception, arising from the view of cognitive development espoused by Piaget) aligns with real learning. We explain in this context, in four steps, how to broadly characterize and arrive at a focus on \(\mathcal {RL}\).

Notes

  1.

    The need for the qualifications (i.e. determinate, non-question-begging) should be obvious. The reply to the present paper’s title question according to which a machine that machine-learns by definition learns, since ‘learn’ appears in ‘machine-learn,’ assumes at the outset that what is called ‘machine learning’ today is real learning; but that’s precisely what’s under question; hence the petitio.

  2.

    All mathematical models of learning relevant to the present discussion that we are aware of take learning to consist fundamentally in the learning of number-theoretic functions from \(\mathbb {N} \times \mathbb {N} \times \cdots \times \mathbb {N}\) to \(\mathbb {N}\). Even when computational learning was firmly and exclusively rooted in classical recursion theory, and dedicated statistical formalisms were nowhere to be found, the target of learning was a function of this kind; see e.g. (Gold 1965; Putnam 1965), a modern, comprehensive version of which is given in (Jain et al. 1999). We have been surprised to hear that some in our audience aren’t aware of the basic, uncontroversial fact, readily appreciated by consulting the standard textbooks we cite here and below, that machine learning in its many guises takes the target of learning to be number-theoretic functions. A “shortcut” to grasping a priori that all systematic, rigorously described forms of learning in matters and activities computational and mechanistic must be rooted in number-theoretic functions, is to simply note that computer science itself consists in the study and embodiment of number-theoretic functions, defined and ordered in hierarchies (e.g. see Davis and Weyuker 1983). We by the way focus herein on unary functions \(f : \mathbb {N} \mapsto \mathbb {N}\) only for ease of exposition.
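
    To make this concrete, here is a minimal, illustrative sketch (ours, not drawn from the cited texts) of Gold-style identification in the limit for a unary number-theoretic function, under the simplifying assumption of a toy hypothesis class of linear functions \(n \mapsto an + b\). The learner is fed the graph of the target one pair at a time and always conjectures the first enumerated hypothesis consistent with the data seen so far; it “learns” the target just in case its conjectures eventually stabilize on a correct one.

      from itertools import count
      from typing import Callable, Iterator, Tuple

      def hypotheses() -> Iterator[Tuple[int, int]]:
          # Enumerate every candidate (a, b) for h(n) = a*n + b, ordered by a + b,
          # so that each hypothesis in this toy class appears eventually.
          for s in count():
              for a in range(s + 1):
                  yield (a, s - a)

      def identify_in_the_limit(stream: Iterator[Tuple[int, int]], steps: int) -> Tuple[int, int]:
          # Gold-style learner: after each new data point, conjecture the first
          # enumerated hypothesis consistent with everything seen so far.
          data = []
          conjecture = (0, 0)
          for _ in range(steps):
              data.append(next(stream))
              for (a, b) in hypotheses():
                  if all(a * n + b == y for (n, y) in data):
                      conjecture = (a, b)
                      break
          return conjecture

      def graph(f: Callable[[int], int]) -> Iterator[Tuple[int, int]]:
          # Present the target function as an enumeration of its graph.
          return ((n, f(n)) for n in count())

      if __name__ == "__main__":
          target = lambda n: 3 * n + 2  # the "unknown" f : N -> N
          print(identify_in_the_limit(graph(target), steps=10))  # stabilizes on (3, 2)

    Note, in keeping with the point pressed in the present paper, that the learner at no point produces a proof that it has identified the target; stabilization is guaranteed only relative to the assumed enumeration.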

  3.

    A biconditional isn’t needed. We use only a weaker set of necessary conditions, not a set of necessary and sufficient conditions.

  4.

    Not to be confused with RL, reinforcement learning, in which real learning, as revealed herein, doesn’t happen.

  5.

    As many readers will know, Searle’s (1980) Chinese Room Argument (CRA) is intended to show that computing machines can’t understand anything. It’s true that Bringsjord has refined, expanded, and defended CRA (e.g. see Bringsjord 1992; Bringsjord and Noel 2002; Bringsjord 2015), but bringing this argumentation to bear here in support of the present paper’s main claim would instantly demand an enormous amount of additional space. And besides, as we now explain, calling upon this argumentation is unnecessary.

  6.

    Since at bottom, as noted (see note 2), the target of learning should be taken for generality and rigor to be a number-theoretic function, it’s natural to consider learning in the realm of the formal sciences.

  7.

    Just as (computer) programs can be correct or incorrect, so too proofs can be correct or incorrect. For more on this, see e.g. (Arkoudas and Bringsjord 2007).

  8.

    If we regard Turing as having been speaking of modern AI in his famous (Turing 1950), note then too that his orientation is test-based: it is there, of course, that he gave the famous ‘Turing Test.’

  9.

    In fact, this is why real learning for humans in mathematics is challenging; see e.g. (Moore 1994).

  10.

    Our assumption here thus specifically invokes connectionist ML. But this causes no loss of generality, as we explain by way of the “tour” of ML taken in Sect. 6.1, and the fact that the proof in the Appendix, as explained there, is a general method that will work for any contemporary form of ML.

  11.

    This is a rough-and-ready extraction from (Jain et al. 1999), and must be sufficient given the space limitations of the present short paper, at least for now. Of course, there are many forms of ML/machine learning in play in AI of today. In Sect. 6.1 we consider different forms of ML in contemporary AI. In Sect. 6.2 we consider different types of “learning” in psychology and allied disciplines.

  12.

    Lathrop (1996) shows, it might be asserted, that uncomputable functions can be machine-learned. But in his scheme, there is only a probabilistic approximation of real learning, and—in clear tension with (c1\('\))–(c3)—no proof in support of the notion that anything has been learned. The absence of such proofs is specifically called out in the formal deduction given in the Appendix.

  13.

    A pair of additional works helps to further seal our case: (Kearns and Vazirani 1994; Shalev-Shwartz and Ben-David 2014). Study of these texts will reveal that \(\mathcal {RL}\) as per (c1\('\))–(c3) is nowhere to be found.
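
    For readers who wish to verify this quickly, the following is our paraphrase (an approximation, not a quotation) of the agnostic PAC-learnability criterion at the heart of Shalev-Shwartz and Ben-David (2014): a hypothesis class \(\mathcal {H}\) is learnable by algorithm \(A\) just in case

    $$\begin{aligned} \forall \epsilon , \delta \in (0,1)\ \exists m_{\mathcal {H}}(\epsilon ,\delta ) \text { s.t. } \forall m \ge m_{\mathcal {H}}(\epsilon ,\delta )\ \forall \mathcal {D}: \Pr _{S \sim \mathcal {D}^m}\!\left[ L_{\mathcal {D}}(A(S)) \le \min _{h \in \mathcal {H}} L_{\mathcal {D}}(h) + \epsilon \right] \ge 1 - \delta . \end{aligned}$$

    What the learner delivers, then, is a high-probability error bound relative to a sample; nothing in the criterion requires the learner to produce, or even be able to check, anything answering to (c1\('\))–(c3).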

  14.

    We of course join epistemological cognoscenti in being aware of Gettier-style cases, but they can be safely left aside here. For the record, Bringsjord claims to have a solution anyway—one that is at least generally in the spirit of Chisholm’s (1966) proposed solution, which involves requiring that the justification in justified-true-belief accounts of knowledge be of a certain sort. For Gettier’s landmark paper, see (Gettier 1963).

  15.

    Specifically, we shall see that the formal deduction of the Appendix is actually a method for showing that other forms of “modern” ML, not just those that rely on ANNs, don’t enable machines to really learn anything. E.g., the method can take Bayesian learning in, and yield as output that such learning isn’t real learning.

  16.

    Shakespeare himself, or better yet even Ibsen, or better better yet Bellow, couldn’t have invented a story dripping with this much irony—a story in which the machine-learning people persecuted the logicians for building “brittle” systems, and then the persecutors promptly proceeded to blithely build comically brittle systems as their trophies (given to themselves).

  17.

    In which, by the way, hypercomputational artificial neural networks are cited.

  18.

    E.g. even beginning textbooks introducing single-variable differential/integral calculus ask for verification of human learning by asking for proofs. The cornerstone and early-on-introduced concept of a limit is accordingly accompanied by requests to students that they supply proofs in order to confirm that they understand this concept. Thus we e.g. have on p. 67 of (Stewart, 2016) a request that our reader prove that

    $$\begin{aligned} \lim _{x \rightarrow 3} g(x) = \lim _{x \rightarrow 3} (4x - 5) = 7. \end{aligned}$$

    What machine-learning machine that has learned the function g here can do that?
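
    For comparison, the sort of verification the textbook requests is a short \(\epsilon \)–\(\delta \) argument along roughly the following lines (our reconstruction, not Stewart’s wording):

    $$\begin{aligned} \text {Given } \epsilon > 0, \text { take } \delta = \epsilon /4. \text { If } 0< |x - 3| < \delta , \text { then } |(4x - 5) - 7| = |4x - 12| = 4|x - 3| < 4\delta = \epsilon . \end{aligned}$$

    Producing (and checking) such an argument, not merely computing values of g, is what the request tests.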

  19.

    This is essentially the Short Short Story Game of (Bringsjord 1998), much harder than such Turing-computable games as Checkers, Chess, and Go, which are all at the same easy level of difficulty (EXPTIME).

  20.

    Outside of the present paper, we have carried out a second analysis that confirms this, by examining learning in AI as characterized in (Russell and Norvig 2009), and invite skeptical readers to carry out their own analysis for this textbook, and indeed for any comprehensive, mainstream textbook. The upshot will be the stark fact that \(\mathcal {RL}\), firmly in place since Euclid as what learning in the formal sciences is, will be utterly absent.

  21.

    Instead of looking to published attempts to systematically present AI (such as the textbooks upon which we rely herein), one could survey practitioners in AI, and see if their views harmonize with the publications explicitly designed to present all of AI (from a high-altitude perspective). E.g., one could turn to such reports as (Müller and Bostrom 2016), in which the authors report on a specific question, given at a conference that celebrated AI’s “turning 50” (AI@50), which asked for an opinion as to the earliest date (computing) machines would be able to simulate human-level learning. It’s rather interesting that 41% of respondents said this would never happen. It would be interesting to know if, in the context of the attention ML receives these days, the number of these pessimists would be markedly smaller. If so, that may well be because, intuitively, plenty of people harbor suspicions that ML in point of fact hasn’t achieved any human-level real learning.

  22.

    Luger’s book revolves around a fundamental distinction between what he calls weak problem-solving versus strong problem-solving.

  23.

    There are a few exceptions. Hummel (2010) has explained that sophisticated and powerful forms of symbolic learning, ones aligned with second-order logic, are superior to associative forms of learning. Additionally, there’s one clear historical exception, but it’s now merely a sliver in psychology (specifically, in psychology of reasoning), and hence presently has insufficient adherents to merit inclusion in the ontology we now proceed to canvass. We refer here to the type of learning over the years of human development and formal education posited by Piaget; e.g. see (Inhelder and Piaget 1958). Piaget’s view, in a barbaric nutshell, is that, given solid academic education, nutrition, and parenting, humans develop the capacity to reason with and even eventually over first-order and modal logic—which means that such humans would develop the capacity to learn in \(\mathcal {RL}\) fashion, in school. Since the attacks on Piaget’s view, starting originally with those of Wason and Johnson-Laird (e.g. see Wason and Johnson-Laird 1972), many psychologists have rejected Piaget’s position. For what it’s worth, Bringsjord has defended Piaget; see e.g. (Bringsjord et al. 1998).

  24.

    We are happy to concede that years of laborious (and tedious?) study of conditioning using appetitive and aversive reinforcement (and such phenomena as inhibitory conditioning, conditioned suppression, higher-order conditioning, conditioned reinforcement, and blocking) have revealed that conditioning can’t be literally reduced to new reflexes, but there is no denying that in conditioning, any new knowledge and representation that takes form falls light years short of \(\mathcal {RL}\).

  25.

    Note that, in keeping with the psychometric operationalization introduced at the outset in order not to rely on the murky concept of understanding, that operationalization could be invoked for all occurrences of ‘understanding’ in the itemized list that follows; but doing so would take much space and time, and be quite inelegant.

  26.

    Peano Arithmetic (PA) is rarely introduced by name in K–12 education, but all the axioms of it, save perhaps for the Induction Schema, are introduced and taught there.
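
    For concreteness, the axioms in question, in one standard formulation (the exact presentation varies across textbooks), are

    $$\begin{aligned} \forall x\, \lnot (s(x) = 0), \quad \forall x \forall y\, (s(x) = s(y) \rightarrow x = y), \quad \forall x\, (x + 0 = x), \quad \forall x \forall y\, (x + s(y) = s(x + y)), \quad \forall x\, (x \times 0 = 0), \quad \forall x \forall y\, (x \times s(y) = (x \times y) + x), \end{aligned}$$

    together with the Induction Schema, which licenses, for each formula \(\phi \), the inference from \(\phi (0) \wedge \forall x\, (\phi (x) \rightarrow \phi (s(x)))\) to \(\forall x\, \phi (x)\).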

  27.

    This conception matches that of an agent in orthodox AI: see the textbooks, e.g. (Luger 2008; Russell and Norvig 2009).

References

  • Achab, M., Bacry, E., Gaïffas, S., Mastromatteo, I., Muzy, J.F.: Uncovering causality from multivariate Hawkes integrated cumulants. In: Precup, D., Teh, Y.W. (eds) Proceedings of the 34th International Conference on Machine Learning, PMLR, International Convention Centre, Sydney, Australia. Proceedings of Machine Learning Research, vol. 70, pp. 1–10 (2017). http://proceedings.mlr.press/v70/achab17a.html

  • Arkoudas, K.: Denotational proof languages. Ph.D. thesis, MIT (2000)

  • Arkoudas, K., Bringsjord, S.: Computers, justification, and mathematical knowledge. Minds Mach. 17(2), 185–202 (2007)

  • Arkoudas, K., Musser, D.: Fundamental Proof Methods in Computer Science: A Computer-Based Approach. MIT Press, Cambridge (2017)

  • Bandura, A., Walters, R.H.: Social Learning Theory, vol. 1. Prentice-Hall, Englewood Cliffs (1977)

  • Bandura, A., Ross, D., Ross, S.A.: Transmission of aggression through imitation of aggressive models. J. Abnorm. Soc. Psychol. 63(3), 575 (1961)

  • Barrett, L.: Beyond the Brain: How Body and Environment Shape Animal and Human Minds. Princeton University Press, Princeton (2015)

  • Bellman, A., Bragg, S., Handlin, W.: Algebra 2: Common Core. Pearson, Upper Saddle River (2012). Series Authors: Charles, R., Kennedy, D., Hall, B., Consulting Authors: Murphy, S.G

  • Boolos, G.S., Burgess, J.P., Jeffrey, R.C.: Computability and Logic, 4th edn. Cambridge University Press, Cambridge (2003)

  • Bringsjord, S.: What Robots Can and Can’t Be. Kluwer, Dordrecht (1992)

  • Bringsjord, S.: Chess is too easy. Technol. Rev. 101(2), 23–28 (1998). http://kryten.mm.rpi.edu/SELPAP/CHESSEASY/chessistooeasy.pdf

  • Bringsjord, S.: Psychometric artificial intelligence. J. Exp. Theor. Artif. Intell. 23(3), 271–277 (2011)

  • Bringsjord, S.: The symbol grounding problem-remains unsolved. J. Exp. Theor. Artif. Intell. 27(1), 63–72 (2015). https://doi.org/10.1080/0952813X.2014.940139

  • Bringsjord, S., Arkoudas, K.: The modal argument for hypercomputing minds. Theor. Comput. Sci. 317, 167–190 (2004)

  • Bringsjord, S., Noel, R.: Real robots and the missing thought experiment in the Chinese room dialectic. In: Preston, J., Bishop, M. (eds.) Views into the Chinese Room: New Essays on Searle and Artificial Intelligence, pp. 144–166. Oxford University Press, Oxford (2002)

  • Bringsjord, S., Schimanski, B.: What is artificial intelligence? psychometric AI as an answer. In: Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI 2003), pp. 887–893. Morgan Kaufmann, San Francisco (2003). http://kryten.mm.rpi.edu/scb.bs.pai.ijcai03.pdf

  • Bringsjord, S., Zenzen, M.: Superminds: People Harness Hypercomputation, and More. Kluwer Academic Publishers, Dordrecht (2003)

  • Bringsjord, S., Bringsjord, E., Noel, R.: In defense of logical minds. In: Proceedings of the 20th Annual Conference of the Cognitive Science Society, pp. 173–178. Lawrence Erlbaum, Mahwah (1998)

  • Bringsjord, S., Kellett, O., Shilliday, A., Taylor, J., van Heuveln, B., Yang, Y., Baumes, J., Ross, K.: A new Gödelian argument for hypercomputing minds based on the busy beaver problem. Appl. Math. Comput. 176, 516–530 (2006)

  • Chisholm, R.: Theory of Knowledge. Prentice-Hall, Englewood Cliffs (1966)

  • Davis, M., Weyuker, E.: Computability, Complexity, and Languages: Fundamentals of Theoretical Computer Science, 1st edn. Academic Press, New York (1983)

  • Dodig-Crnkovic, G., Giovagnoli, R. (eds.): Computing Nature: Turing Centenary Perspective. Springer, Berlin (2013). https://www.springer.com/us/book/9783642372247

  • Domjan, M.: The Principles of Learning and Behavior, 7th edn. Cengage Learning, Stamford (2015)

  • Gallistel, C.R.: Learning and representation. In: Learning and Memory: A Comprehensive Reference, vol. 1. Elsevier (2008)

  • Gettier, E.: Is justified true belief knowledge? Analysis 23, 121–123 (1963). http://www.ditext.com/gettier/gettier.html

  • Gold, M.: Limiting recursion. J. Symb. Logic 30(1), 28–47 (1965)

  • Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge (2016). http://www.deeplearningbook.org

  • Goodstein, R.: On the restricted ordinal theorem. J. Symb. Logic 9(31), 33–41 (1944)

  • Huitt, W.: Classroom Instruction. Educational Psychology Interactive (2003)

  • Hummel, J.: Symbolic versus associative learning. Cogn. Sci. 34(6), 958–965 (2010)

  • Inhelder, B., Piaget, J.: The Growth of Logical Thinking from Childhood to Adolescence. Basic Books, New York (1958)

  • Jain, S., Osherson, D., Royer, J., Sharma, A.: Systems That Learn: An Introduction to Learning Theory, 2nd edn. MIT Press, Cambridge (1999)

  • Kearns, M., Vazirani, U.: An Introduction to Computational Learning Theory. MIT Press, Cambridge (1994)

  • Kitzelmann, E.: Inductive programming: a survey of program synthesis techniques. In: International Workshop on Approaches and Applications of Inductive Programming, pp 50–73. Springer (2009)

  • Lathrop, R.: On the learnability of the uncomputable. In: Saitta, L. (ed.) Proceedings of the 13th International Conference on Machine Learning (Italy, 3–6 July 1996), pp. 302–309. Morgan Kaufmann, San Francisco (1996). https://pdfs.semanticscholar.org/6919/b6ad91d9c3aa47243c3f641ffd30e0918a46.pdf

  • LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)

  • Luger, G.: Artificial Intelligence: Structures and Strategies for Complex Problem Solving, 6th edn. Pearson, London (2008)

  • Mackintosh, N.J.: Conditioning and Associative Learning. Clarendon Press, Oxford (1983)

  • Marblestone, A.H., Wayne, G., Kording, K.P.: Toward an integration of deep learning and neuroscience. Front. Comput. Neurosci. 10(94) (2016). https://doi.org/10.3389/fncom.2016.00094

  • Moore, R.C.: Making the transition to formal proof. Educ. Stud. Math. 27(3), 249–266 (1994)

  • Müller, V., Bostrom, N.: Future progress in artificial intelligence: a survey of expert opinion. In: Müller, V. (ed.) Fundamental Issues of Artificial Intelligence (Synthese Library), pp. 553–571. Springer, Berlin (2016)

  • Penn, D., Holyoak, K., Povinelli, D.: Darwin’s mistake: explaining the discontinuity between human and nonhuman minds. Behav. Brain Sci. 31, 109–178 (2008)

  • Putnam, H.: Trial and error predicates and a solution to a problem of Mostowski. J. Symbolic Logic 30(1), 49–57 (1965)

  • Rado, T.: On non-computable functions. Bell Syst. Tech. J. 41, 877–884 (1963)

  • Ross, J.: Immaterial aspects of thought. J. Philos. 89(3), 136–150 (1992)

  • Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach, 3rd edn. Prentice Hall, Upper Saddle River (2009)

  • Schapiro, A., Turk-Browne, N.: Statistical learning. Brain Mapp. Encyclopedic Ref. 3, 501–506 (2015)

  • Searle, J.: Minds, brains and programs. Behav. Brain Sci. 3, 417–424 (1980)

  • Shalev-Shwartz, S., Ben-David, S.: Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, Cambridge (2014)

  • Stewart, J.: Calculus, 8th edn. Cengage Learning, Boston (2016). We refer here to an electronic version of the print textbook. The “Student Edition” of the hard-copy textbook has ISBN 978-1-305-27176-0.

  • Titley, H.K., Brunel, N., Hansel, C.: Toward a neurocentric view of learning. Neuron 95(1), 19–32 (2017)

  • Turing, A.: Computing machinery and intelligence. Mind 59(236), 433–460 (1950)

  • Wason, P., Johnson-Laird, P.: Psychology of Reasoning: Structure and Content. Harvard University Press, Cambridge (1972)

  • Wolpert, D.H.: The lack of a priori distinctions between learning algorithms. Neural Comput. 8(7), 1341–1390 (1996)

Acknowledgement

We are deeply appreciative of feedback received at PT-AI 2017, the majority of which is addressed herein. The first author is also specifically indebted to John Hummel for catalyzing, in vibrant discussions at MAICS 2017, the search for formal arguments and/or theorems establishing the proposition Hummel and Bringsjord co-affirm: viz., statistical machine learning simply doesn’t enable machines to actually learn, period. Bringsjord is also thankful to Sergei Nirenburg for valuable conversations. Many readers of previous drafts have been seduced by it’s-not-really-learning forms of learning (including worse-off-than artificial neural network (ANN) based deep learning (DL) folks: Bayesians), and have offered spirited objections, all of which are refuted herein; yet we are grateful for the valiant tries. Bertram Malle stimulated and guided our sustained study of types of learning in play in psychology\(^+\), and we are thankful. Jim Hendler graciously read an early draft; his resistance has been helpful (though perhaps now he’s a convert). The authors are also grateful for five anonymous reviews, some portions of which reflected at least partial and passable understanding of our logico-mathematical perspective, from which informal notions of “learning” are inadmissible in such debates as the present one. Two perspicacious comments and observations from two particular PT-AI 2017 participants, subsequent to the conference, proved productive to deeply ponder. We acknowledge the invaluable support of “Advanced Logicist Machine Learning” from ONR, and of “Great Computational Intelligence” from AFOSR. Finally, without the wisdom, guidance, leadership, and raw energy of Vincent Müller, PT-AI 2017, and any ideas of ours that have any merit at all, and that were expressed there and/or herein, would not have formed.

Author information

Correspondence to Selmer Bringsjord.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Bringsjord, S., Govindarajulu, N.S., Banerjee, S., Hummel, J. (2018). Do Machine-Learning Machines Learn? In: Müller, V. (eds) Philosophy and Theory of Artificial Intelligence 2017. PT-AI 2017. Studies in Applied Philosophy, Epistemology and Rational Ethics, vol 44. Springer, Cham. https://doi.org/10.1007/978-3-319-96448-5_14

  • DOI: https://doi.org/10.1007/978-3-319-96448-5_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-96447-8

  • Online ISBN: 978-3-319-96448-5
