LLM Cognitive Judgements Differ from Human

  • Conference paper
Frontiers of Artificial Intelligence, Ethics, and Multidisciplinary Applications (FAIEMA 2023)


Large Language Models (LLMs) have lately been in the spotlight of researchers, businesses, and consumers alike. While the linguistic capabilities of such models have been studied extensively, there is growing interest in investigating them as cognitive subjects. In the present work, I examine the capabilities of GPT-3 and ChatGPT on a limited-data inductive reasoning task from the cognitive science literature. The results suggest that these models' cognitive judgements are not human-like.


  1. Data source: Google Trends (


Author information


Corresponding author

Correspondence to Sotiris Lamprinidis.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Lamprinidis, S. (2024). LLM Cognitive Judgements Differ from Human. In: Farmanbar, M., Tzamtzi, M., Verma, A.K., Chakravorty, A. (eds) Frontiers of Artificial Intelligence, Ethics, and Multidisciplinary Applications. FAIEMA 2023. Frontiers of Artificial Intelligence, Ethics and Multidisciplinary Applications. Springer, Singapore.
