
Vision, Thinking, and Model-Based Inferences

  • Chapter
Springer Handbook of Model-Based Science

Part of the book series: Springer Handbooks ((SHB))

Abstract

Model-based reasoning refers to the sorts of inferences performed on the basis of a knowledge context that guides them. This context constitutes a model of a domain of reality, that is, a representation, approximate and simplified to varying degrees, of the factors that underlie, and the interrelations that govern, the behavior of this domain.

This chapter addresses both the problem of whether vision involves model-based inferences and, if so, of what kind; and the problem of the nature of the context that acts as the model guiding visual inferences. It also addresses the broader problem of the relation between visual processing and thinking. To this end, the various modes of inference, the most predominant conceptions of visual perception, the stages of visual processing, the problem of the cognitive penetrability of perception, and the logical status of the processes involved in all stages of visual processing will be discussed and assessed.

The goal of this chapter is, on the one hand, to provide the reader with an overview of the main broad problems currently debated in philosophy, cognitive science, and visual science, and, on the other hand, to equip them with the knowledge needed to follow and assess current discussions on the nature of visual processes and their relation to thinking and cognition in general.


Abbreviations

2-D: two-dimensional
2.5-D: two-and-a-half-dimensional
3-D: three-dimensional
AI: artificial intelligence
CI: cognitively impenetrable
CP: cognitive penetrability
ERP: event-related potential
FEF: frontal eye fields
FFS: feedforward sweep
GRP: global recurrent processing
HSF: high spatial frequency
IT: inferotemporal cortex
LOC: lateral occipital complex
LRP: local recurrent processing
LSF: low spatial frequency
LTM: long-term memory
RP: recurrent processes
WM: working memory

References

  1. H. von Helmholtz: Treatise on Physiological Optics (Dover, New York 1878/1925)
  2. I. Rock: The Logic of Perception (MIT Press, Cambridge 1983)
  3. E.S. Spelke: Object perception. In: Readings in Philosophy and Cognitive Science, ed. by A.I. Goldman (MIT Press, Cambridge 1988)
  4. A. Clark: Whatever next? Predictive brains, situated agents, and the future of cognitive science, Behav. Brain Sci. 36, 181–253 (2013)
  5. M. Rescorla: The causal relevance of content to computation, Philos. Phenomenol. Res. 88(1), 173–208 (2014)
  6. L. Shams, U.R. Beierholm: Causal inference in perception, Trends Cogn. Sci. (Regul. Ed.) 14, 425–432 (2010)
  7. N. Orlandi: The Innocent Eye: Why Vision Is not a Cognitive Process (Oxford Univ. Press, Oxford 2014)
  8. P. Lipton: Inference to the Best Explanation, 2nd edn. (Routledge, London, New York 2004)
  9. D.G. Campos: On the distinction between Peirce's abduction and Lipton's inference to the best explanation, Synthese 180, 419–442 (2011)
  10. G. Minnameier: Peirce-suit of truth: Why inference to the best explanation and abduction ought not to be confused, Erkenntnis 60, 75–105 (2004)
  11. G. Harman: Enumerative induction as inference to the best explanation, J. Philos. 65(18), 529–533 (1968)
  12. D. Marr: Vision: A Computational Investigation into Human Representation and Processing of Visual Information (Freeman, San Francisco 1982)
  13. I. Biederman: Recognition by components: A theory of human image understanding, Psychol. Rev. 94, 115–147 (1987)
  14. A. Johnston: Object constancy in face processing: Intermediate representations and object forms, Ir. J. Psychol. 13, 425–438 (1992)
  15. G.W. Humphreys, V. Bruce: Visual Cognition: Computational, Experimental and Neuropsychological Perspectives (Lawrence Erlbaum, Hove 1989)
  16. J.J. Gibson: The Ecological Approach to Visual Perception (Houghton-Mifflin, Boston 1979)
  17. J. Fodor, Z. Pylyshyn: How direct is visual perception? Some reflections on Gibson's 'Ecological Approach', Cognition 9, 139–196 (1981)
  18. M. Rowlands: The New Science of Mind: From Extended Mind to Embodied Phenomenology (MIT Press, Cambridge 2010)
  19. J. Norman: Two visual systems and two theories of perception: An attempt to reconcile the constructivist and ecological approaches, Behav. Brain Sci. 25, 73–144 (2002)
  20. V. Bruce, P.R. Green: Visual Perception: Physiology, Psychology and Ecology, 2nd edn. (Lawrence Erlbaum, Hillsdale 1993)
  21. A. Noë: Action in Perception (MIT Press, Cambridge 2004)
  22. J. Pearl: Causality: Models, Reasoning and Inference (Cambridge Univ. Press, Cambridge 2009)
  23. V.A.F. Lamme: Why visual attention and awareness are different, Trends Cogn. Sci. 7, 12–18 (2003)
  24. V.A.F. Lamme: Independent neural definitions of visual awareness and attention. In: The Cognitive Penetrability of Perception: An Interdisciplinary Approach, ed. by A. Raftopoulos (Nova Science Books, Hauppauge 2004)
  25. A. Clark: An embodied cognitive science?, Trends Cogn. Sci. 3(9), 345–351 (1999)
  26. P. Vecera: Toward a biased competition account of object-based segmentation and attention, Brain Mind 1, 353–384 (2000)
  27. E.C. Hildreth, S. Ullman: The computational study of vision. In: Foundations of Cognitive Science, ed. by M.I. Posner (MIT Press, Cambridge 1989)
  28. Z. Pylyshyn: Is vision continuous with cognition? The case for cognitive impenetrability of visual perception, Behav. Brain Sci. 22, 341–423 (1999)
  29. M. Bar: The proactive brain: Memory for predictions, Philos. Trans. R. Soc. Lond. B Biol. Sci. 364, 1235–1243 (2009)
  30. K. Kihara, Y. Takeda: Time course of the integration of spatial frequency-based information in natural scenes, Vis. Res. 50, 2158–2162 (2010)
  31. C. Peyrin, C.M. Michel, S. Schwartz, G. Thut, M. Seghier, T. Landis, C. Marendaz, P. Vuilleumier: The neural processes and timing of top-down processes during coarse-to-fine categorization of visual scenes: A combined fMRI and ERP study, J. Cogn. Neurosci. 22, 2768–2780 (2010)
  32. A. Delorme, G.A. Rousselet, M.J.-M. Macé, M. Fabre-Thorpe: Interaction of top-down and bottom-up processing in the fast visual analysis of natural scenes, Cogn. Brain Res. 19, 103–113 (2004)
  33. L. Chelazzi, E. Miller, J. Duncan, R. Desimone: A neural basis for visual search in inferior temporal cortex, Nature 363, 345–347 (1993)
  34. P.R. Roelfsema, V.A.F. Lamme, H. Spekreijse: Object-based attention in the primary visual cortex of the macaque monkey, Nature 395, 376–381 (1998)
  35. S.M. Kosslyn: Image and Brain (MIT Press, Cambridge 1994)
  36. A. Raftopoulos: Cognition and Perception: How Do Psychology and the Neural Sciences Inform Philosophy? (MIT Press, Cambridge 2009)
  37. T. Burge: Origins of Objectivity (Clarendon Press, Oxford 2010)
  38. P. Cavanagh: Visual cognition, Vis. Res. 51, 1538–1551 (2011)
  39. R. Gregory: Concepts and Mechanisms of Perception (Charles Scribner's Sons, New York 1974)
  40. K. Grill-Spector, T. Kushnir, T. Hendler, S. Edelman, Y. Itzchak, R. Malach: A sequence of object-processing stages revealed by fMRI in the human occipital lobe, Hum. Brain Mapp. 6, 316–328 (1998)
  41. H. Liu, Y. Agam, J.R. Madsen, G. Kreiman: Timing, timing, timing: Fast decoding of object information from intracranial field potentials in human visual cortex, Neuron 62, 281–290 (2009)
  42. M. Peterson: Overlapping partial configurations in object memory. In: Perception of Faces, Objects, and Scenes: Analytic and Holistic Processes, ed. by M. Peterson, G. Rhodes (Oxford Univ. Press, New York 2003)
  43. M. Peterson, J. Enns: The edge complex: Implicit memory for figure assignment in shape perception, Percept. Psychophys. 67(4), 727–740 (2005)
  44. M. Fabre-Thorpe, A. Delorme, C. Marlot, S. Thorpe: A limit to the speed of processing in ultra-rapid visual categorization of novel natural scenes, J. Cogn. Neurosci. 13(2), 171–180 (2001)
  45. J.S. Johnson, B.A. Olshausen: The earliest EEG signatures of object recognition in a cued-target task are postsensory, J. Vis. 5, 299–312 (2005)
  46. S. Thorpe, D. Fize, C. Marlot: Speed of processing in the human visual system, Nature 381, 520–522 (1996)
  47. S.M. Crouzet, H. Kirchner, S.J. Thorpe: Fast saccades toward faces: Face detection in just 100 ms, J. Vis. 10(4), 1–17 (2010)
  48. H. Kirchner, S.J. Thorpe: Ultra-rapid object detection with saccadic eye movements: Visual processing speed revisited, Vis. Res. 46, 1762–1776 (2006)
  49. K. Grill-Spector, R. Henson, A. Martin: Repetition and the brain: Neural models of stimulus-specific effects, Trends Cogn. Sci. 10, 14–23 (2006)
  50. M. Chaumon, V. Drouet, C. Tallon-Baudry: Unconscious associative memory affects visual processing before 100 ms, J. Vis. 8(3), 1–10 (2008)
  51. S. Ullman, M. Vidal-Naquet, E. Sali: Visual features of intermediate complexity and their use in classification, Nat. Neurosci. 5(7), 682–687 (2002)
  52. V.A.F. Lamme, H. Super, R. Landman, P.R. Roelfsema, H. Spekreijse: The role of primary visual cortex (V1) in visual awareness, Vis. Res. 40(10-12), 1507–1521 (2000)
  53. R. VanRullen, S.J. Thorpe: The time course of visual processing: From early perception to decision making, J. Cogn. Neurosci. 13, 454–461 (2001)
  54. A. Torralba, A. Oliva: Statistics of natural image categories, Network 14, 391–412 (2003)
  55. R. Jackendoff: Consciousness and the Computational Mind (MIT Press, Cambridge 1987)
  56. F. Jackson: Perception: A Representative Theory (Cambridge Univ. Press, Cambridge 1977)
  57. F. Dretske: Conscious experience, Mind 102, 263–283 (1993)
  58. F. Dretske: Naturalizing the Mind (MIT Press, Cambridge 1995)
  59. S.E. Palmer: Vision Science: Photons to Phenomenology (MIT Press, Cambridge 1999)
  60. J. McDowell: Mind and World (Harvard Univ. Press, Cambridge 2004)
  61. A. Treisman: How the deployment of attention determines what we see, Vis. Cogn. 14, 411–443 (2006)
  62. A. Treisman, N.G. Kanwisher: Perceiving visually presented objects: Recognition, awareness, and modularity, Curr. Opin. Neurobiol. 8, 218–226 (1998)
  63. J. Perry: Knowledge, Possibility, and Consciousness, 2nd edn. (MIT Press, Cambridge 2001)
  64. R.C. Stalnaker: Our Knowledge of the Internal World (Clarendon Press, Oxford 2008)
  65. L. Magnani: Abductive Cognition: The Epistemological and Eco-Cognitive Dimensions of Hypothetical Reasoning (Springer, Berlin 2009)
  66. C.S. Peirce: Perceptual judgments (1902). In: Philosophical Writings of Peirce, ed. by J. Buchler (Dover, New York 1955)
  67. C.S. Peirce, N. Houser: The Essential Peirce: Selected Philosophical Writings, Vol. 2 (Indiana Univ. Press, Bloomington 1998)
  68. S. Toulmin: The Uses of Argument (Cambridge Univ. Press, Cambridge 1958)
  69. D.I. Perrett, M.W. Oram, J.K. Hietanen, P.J. Benson: Issues of representation in object vision. In: The Neuropsychology of Higher Vision: Collated Tutorial Essays, ed. by M.J. Farah, G. Ratcliff (Lawrence Erlbaum, Hillsdale 1994)
  70. E.S. Spelke, R. Kestenbaum, D.J. Simons, D. Wein: Spatio-temporal continuity, smoothness of motion and object identity in infancy, Br. J. Dev. Psychol. 13, 113–142 (1995)
  71. E.S. Spelke: Principles of object perception, Cogn. Sci. 14, 29–56 (1990)
  72. A. Karmiloff-Smith: Beyond Modularity: A Developmental Perspective on Cognitive Science (MIT Press, Cambridge 1992)
  73. G.F. Poggio, W.H. Talbot: Mechanisms of static and dynamic stereopsis in foveal cortex of the rhesus monkey, J. Physiol. 315, 469–492 (1981)
  74. D. Ferster: A comparison of binocular depth mechanisms in areas 17 and 18 of the cat visual cortex, J. Physiol. 311, 623–655 (1981)
  75. J.E.W. Mayhew, J.P. Frisby: Psychophysical and computational studies towards a theory of human stereopsis, Artif. Intell. 17, 349–385 (1981)

Author information

Correspondence to Athanassios Raftopoulos.

Appendices

Appendix: Forms of Inferences

These are the three forms of inference into which all syllogisms can be categorized.

1.1 Deduction

An inference is deductive if its logical structure is such that the conclusion of the inference is a logical consequence of the premises of the inference. This entails that if the premises of a deductive argument are true then its conclusion is necessarily true as well. In this sense, deductive arguments are truth preserving. This is equivalent to saying that in any interpretation of the inference in which the premises are true, the conclusion is true too. Differently put, if an argument is deductively valid, there is no model under which the premises are true but the conclusion is false. This is why deductive inferences are sometimes characterized as conclusive.

A typical example of a deductive argument is this: All men are mortal; Socrates is a man. Therefore Socrates is mortal.
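The definition above, validity as truth of the conclusion in every model of the premises, can be checked mechanically for propositional arguments. The following Python sketch is illustrative only (the encoding of premises as Boolean functions and the function name are assumptions, not part of the chapter): it enumerates every truth assignment and looks for a counter-model.

```python
from itertools import product

def is_valid(premises, conclusion, variables):
    """An argument is deductively valid iff no assignment (model)
    makes every premise true and the conclusion false."""
    for values in product([True, False], repeat=len(variables)):
        model = dict(zip(variables, values))
        if all(p(model) for p in premises) and not conclusion(model):
            return False  # counter-model found: premises true, conclusion false
    return True

# Modus ponens: from P and P -> Q, infer Q (valid).
print(is_valid([lambda m: m["P"], lambda m: (not m["P"]) or m["Q"]],
               lambda m: m["Q"], ["P", "Q"]))  # True

# Affirming the consequent: from Q and P -> Q, infer P (invalid).
print(is_valid([lambda m: m["Q"], lambda m: (not m["P"]) or m["Q"]],
               lambda m: m["P"], ["P", "Q"]))  # False
```

The invalid argument fails because the assignment P = false, Q = true makes both premises true and the conclusion false, exactly the kind of model whose nonexistence validity requires.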

1.2 Induction

An argument is inductive if its conclusion does not follow logically from the premises. The premises of an inductive argument may be true and still its conclusion false. The premises of an inductive argument provide epistemic support or epistemic warrant for its conclusion; they constitute evidence for the conclusion. By definition, inductive arguments are not truth preserving.

A typical example of an inductive argument is the following: Bird α is a crow and is black; bird β is a crow and is black; … bird κ is a crow and is black. Therefore: All crows are probably black.

If the examined specimens are found in a variety of places and under different environmental conditions, the premises of the inference provide solid evidence for the conclusion. Yet the conclusion may still be wrong, since the next crow that we examine may not be black. This example shows that the conclusion does not follow logically from the premises. No matter how good the premises, that is, the evidence, are, it is still possible that the conclusion is false, which explains the qualification probably in the conclusion of an inductive argument. The world could be such that even though crows α through κ are black, crow κ + 1 is white. For this reason inductions are considered to be nonconclusive but tentative [26.68].

1.3 Abduction or Inference to the Best Explanation

Abduction is an inference in which a series of facts, which are either new, or improbable, or surprising on their own or in conjunction, are used as premises leading to a conclusion that constitutes an explanation of these facts. This explanation makes them more probable and more comprehensible in that it accounts for their appearance. As such, with abductive inferences the mind reaches conclusions that go far beyond what is given. For this reason, abductions are the main theoretical tools for building models and theories that explain reality. Abduction is inductive since it is ampliative and does not preserve truth, and it is thus probabilistic in that its conclusion is tentative.

1.4 Differences Between the Modes of Inference

1.4.1 Induction versus Deduction

Induction is an ampliative inference, whereas deduction is not ampliative. This means that the information conveyed by the conclusion of an inductive argument goes beyond the information conveyed by the premises and, in this sense, the conclusion is not implicitly contained in the premises. In deduction, the conclusion is implicitly contained in the premises and the inference just makes it explicit. If all men are mortal and Socrates is a man, for example, the fact that Socrates is mortal is implicitly contained in these two propositions. What the deduction does is to render it explicit in the form of the conclusion. When we deduce that Socrates is mortal, our knowledge does not extend beyond what we already knew; the deduction only makes it explicit. When, on the other hand, we inductively infer that all crows are probably black from the premise that all the specimens of crows that we have examined thus far are black, we extend the scope of our knowledge, because the conclusion concerns all crows and not just the crows thus far examined.

The above discussion entails the main difference between deductive and inductive arguments: deductive arguments are monotonic, while inductive arguments are not. This means that a valid deductive argument remains valid no matter how many premises we add to it. The reason is that the validity of the deductive argument presupposes that the conclusion is a logical consequence of its premises. This fact does not change with the addition of new premises, no matter what these premises stipulate, and thus the deductive argument remains valid. Things are radically different in induction. A new premise may change the conclusion even if the previous premises strongly supported it. For example, if we discover that crow κ + 1 is white, this undermines the previously drawn and well-supported conclusion that all crows are black.
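The nonmonotonicity of induction can be illustrated with a minimal Python sketch of the crow example (the function name and the encoding of evidence as a list of colour strings are illustrative assumptions):

```python
def inductive_conclusion(observed_crows):
    """Enumerative induction: generalize from the observed cases.
    The conclusion is tentative -- one added premise can overturn it."""
    if all(colour == "black" for colour in observed_crows):
        return "all crows are probably black"
    return "not all crows are black"

evidence = ["black"] * 10
print(inductive_conclusion(evidence))              # supported so far
print(inductive_conclusion(evidence + ["white"]))  # one new premise flips it
```

A deductive consequence, by contrast, survives any addition: discovering a white crow cannot invalidate the inference from "all men are mortal" and "Socrates is a man" to "Socrates is mortal".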

1.4.2 Induction versus Abduction

Both abduction and induction are tentative forms of inference in that they do not warrant the truth of their conclusion even if the premises are true. They are also both ampliative, in that the conclusion introduces information that was not contained implicitly in the premises. As we have seen, in abduction one aims to explain or account for a set of data. Induction is a more general form of inference. When, for instance, one successfully tests a hypothesis by making predictions that are borne out, the predicted data provide inductive, but not abductive, support for the hypothesis. In general, the evaluation phase in hypothesis, or theory, construction is considered to be inductive. Conceiving the explanatory hypothesis, on the other hand, is an abductive process that may take the form of a pure, educated guess that need not have involved any previous testing. In this case, the abductively produced hypothesis is not, a priori, the best explanation for the set of data that needs explaining; this is one of the occasions on which abduction can be distinguished from inference to the best explanation. It should be stressed, however, although I do not have the space to elaborate on this problem, that in realistic scientific practice abduction as theory construction cannot be separated from the evaluative inductive phase, since the two are inextricably linked. This justifies the claim that abduction is an inference to the best explanation.

A further difference between abduction and induction is that even though both kinds of inference are ampliative, in abduction the conclusion may, and usually does, contain terms that do not figure in the premises. Almost all theoretical entities in science were conceived as a result of abduction. The nucleus of the atom, for example, was posited as a way of explaining the scattering of particles after the bombardment of atoms. Nowhere in the premises of the abductive argument was the notion of a nucleus present; the evidence consisted in measurements of the deviation of the particles' pathways from their predicted values after the bombardment. The conclusion all crows are probably black, on the other hand, contains only terms that are available in the premises.

Appendix: Constructivism

Some of the particular proposals of Marr's model have been criticized on many grounds (see, for example, [26.59]). In particular, against Marr's model of object recognition, several researchers have argued that object recognition may be more image-based than based on object-centered representations, which means that the latter may be less important than Marr thought them to be. Neurophysiological studies [26.69] also suggest that both object-centered and viewer-centered representations play a substantial role in object recognition. Nevertheless, his general ideas about the gradual construction of visual representations remain useful. According to this form of constructivism, vision consists of four stages, each of which outputs a different kind of visual representation:

1. The formation of the retinal image. The immediate stimulus for vision, that is, the first stimulus that directly affects the sensory organs (the proximal stimulus), is the pair of two-dimensional (2-D) images projected from the environment onto the eyes. This representation is based on a 2-D retinal organization. At this stage, the information impinging on the retina (which, as you may recall, concerns intensity of illumination and wavelengths, and which is captured by the retinal receptors) is organized so that all of the information about the spatial distribution of light (i.e., the light intensity falling on each retinal receptor) is recast in a reference frame that consists of square image elements (pixels), each indicating with a numerical value the light intensity falling on each receptor. Sometimes, the processes of this stage are called sensation.

2. The image-based stage. It includes operations that receive as input the retinal image (that is, the numerical array of values of light intensities in each pixel) and process it in order to detect local edges and lines, to link these edges and lines on a more global scale, to match up corresponding images in the two eyes, to define 2-D regions in the image, and to detect line terminations and blobs. This stage outputs 2-D surfaces at some particular slant that are located at some distance from the viewer in 3-D space.

In general, the image-based representation has the following properties: First, it receives as input, and thus operates on, information about the 2-D structure of the retinal image rather than information concerning the physical, distal objects. Second, its geometry is inherently two-dimensional. Third, the image-based representation of the 2-D features is cast in a coordinate reference system that is defined with respect to the retina (as a result, the organization of the information is called retinotopic). This means that the axes of the reference system are aligned with the eye rather than the body or the environment. This stage is the first stage of perception proper.

3. The surface-based stage. In this stage, vision constructs representations of the intrinsic properties of the surfaces in the environment that might have produced the features constructed in the image-based stage. At this stage, and in contradistinction to the preceding stage, the information about the worldly surfaces is represented in three dimensions. Marr's two-and-a-half-dimensional (2.5-D) sketch is a typical example of a surface-based representation. Note that the surface-based representation of a visual scene does not contain information about all the surfaces that are present in the scene, but only about those that are visible from the viewer's current viewpoint.

In general, the surface-based representation has the following properties: First, the elements on which the surface-based stage operates consist of the output of the image-based stage, that is, 2-D surfaces at some particular slant that are located at some distance from the viewer in 3-D space. Second, these 2-D surfaces are represented within a 3-D spatial framework. Third, the aforementioned reference framework is defined in terms of the direction and distance of the surfaces from the observer's standpoint (it is egocentric).

4. The object-based stage. This is the stage in which the visual system constructs 3-D representations of objects that include at least some of the occluded surfaces of the objects, that is, the surfaces that are invisible from the standpoint of the viewer, such as the back parts of objects. In this sense, this is the stage in which explicit representations of whole objects in the environment are constructed. It goes without saying that in order for the visual system to achieve this aim, it must use information about whole objects that viewers have stored from their previous visual encounters with objects of the same type. The viewer retrieves this information from memory and uses it to fill in the surface-based image constructed at the previous stage.

In general, the object-based representation has the following properties: First, this stage outputs volumetric representations of objects that may include information about unseen surfaces. Second, the space in which these objects are represented is three-dimensional. Third, the frame of reference in which the object-based representations are cast is defined in terms of the intrinsic structure of the objects and the visual scene (it is scene-based or allocentric).
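As a rough illustration of the passage from the retinal image to the image-based stage, the following toy Python sketch represents a retinal image as a numeric intensity array and marks local edges where neighbouring values differ sharply. The array, the threshold, and the function are illustrative assumptions for exposition only, not Marr's actual algorithms.

```python
# A toy "retinal image": a 2-D array of light intensities (stage 1).
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

def local_edges(img, threshold=4):
    """Image-based stage (sketch): detect local edges as large
    horizontal intensity differences between neighbouring pixels."""
    edges = []
    for r, row in enumerate(img):
        for c in range(len(row) - 1):
            if abs(row[c] - row[c + 1]) >= threshold:
                edges.append((r, c))  # edge between columns c and c+1
    return edges

print(local_edges(image))  # [(0, 1), (1, 1), (2, 1), (3, 1)]
```

The detected edge locations line up into a single vertical contour, the kind of 2-D feature that the later surface-based and object-based stages would interpret as the boundary of a surface in 3-D space.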

Appendix: Bayes’ Theorem and Some of Its Epistemological Aspects

Bayes' theorem is the following probabilistic formula (in its simple form; there is another formulation when one considers two competing hypotheses), where A is a hypothesis purporting to explain a set of data B

$$P(A/B)=P(B/A)P(A)/P(B)\;,$$

where P(A) is the prior probability, that is, the initial degree of belief in A; P(A/B) is the conditional probability of A given B, or posterior probability, that is, the degree of belief in A after taking into consideration B; P(B) is the probability of B. P(B/A) is the likelihood of B given A, that is, the degree of belief that B is true given that A is true. The ratio \(P(B/A)/P(B)\) represents the degree of support that B provides for A.

Suppose that B is the sensory information encoded by a neuronal assembly at level l-1, and A is the hypothesis that the neuronal assembly at level l posits as an explanation of B. Bayes' theorem tells us that the probability that A is true, that is, the probability that level l represents a true pattern in the environment given the sensory data B, depends first on the prior probability of hypothesis A, that is, the probability of A before the predictions of A are tested. This prior probability depends on the incoming signal to l but also, and most crucially, because many different causes could have produced the incoming signal, on contextual effects, since these are the factors that determine which is the most likely explanation of the data among the various possible alternative accounts.

The probability of A also depends on P(B/A), that is, the probability that B is true given A. This reflects a significant epistemological insight, namely, that when a hypothesis correctly accounts for a set of data, these data are a natural consequence of the explaining hypothesis, or naturally fit into the conceptual framework created by the hypothesis. The various gravity phenomena, for instance, become very plausible in view of the law of gravity; they are not so plausible if the hypothesis purporting to explain these same phenomena involves some accidents of nature, even if they are systematic. To put it the other way around: if gravity exists, then the probability that unsupported objects will fall down is greater than the probability of these objects falling down if some other hypothesis is postulated to explain the fall of unsupported objects.

The probability of the hypothesis A depends inversely on the probability of the data B. Since probabilities take values between 0 and 1, the smaller the probability in the denominator, that is, the more surprising and thus improbable B is, the greater the probability that A is true given B. This part of the equation also reflects an important epistemological insight, namely, that the more surprising a set of data is, the more likely a hypothesis that successfully explains it is to be true. Finally, the ratio \(P(B/A)/P(B)\) expresses the support B provides to A, in the sense that the greater this ratio, the greater the probability that hypothesis A is true.
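The epistemological points above can be checked numerically. In the minimal Python sketch below (the particular probability values are illustrative assumptions), halving P(B) while holding the prior and the likelihood fixed doubles the posterior, mirroring the claim that more surprising data lend more support to the hypothesis that explains them.

```python
def posterior(prior_a, likelihood_b_given_a, prob_b):
    """Bayes' theorem in its simple form: P(A/B) = P(B/A) P(A) / P(B)."""
    return likelihood_b_given_a * prior_a / prob_b

# Same hypothesis, same likelihood; only the surprisingness of B changes.
print(posterior(0.2, 0.9, 0.6))  # ~0.3 -- unsurprising data
print(posterior(0.2, 0.9, 0.3))  # ~0.6 -- surprising (improbable) data
```

The ratio P(B/A)/P(B) is 1.5 in the first case and 3 in the second, so the degree of support B provides for A doubles as B becomes twice as improbable.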

Appendix: Modal and Amodal Completion or Perception

There are two sorts of completion. In modal completion the viewer has a distinct visual impression of a hidden contour or other hidden features even though these features are not occurrent sensory features. The perceptual system fills in the missing features, which thus become as phenomenally occurrent as the occurrent sensory features of the object.

In amodal completion, one does not have a perceptual impression of the object's hidden features, since the perceptual system does not fill in the missing features as it does in modal completion (although, as we shall see, mental imagery can fill in the missing phenomenology); the hidden features are not perceptually occurrent.

There are cases of amodal perception that are purely perceptual, that is, bottom-up. In these cases, although no direct signals from the hidden features impinge on the retina (there is no local information available), the perceptual system can extract information regarding them from the global information contained in the visual scene without any cognitive involvement, as the resistance of the ensuing percepts to beliefs indicates. However, in such cases, the hidden features are not perceived. One simply has the visual impression of a single concrete object that is partially occluded and not the visual impression of various disparate image regions. Therefore, in these perceptually driven amodal completions there is no mental imagery involved, since no top-down signals from cognitive areas are required for the completion, and since the hidden features are not phenomenologically present.

There are also cases of amodal completion that are cognitively driven, such as the formation of the 3-D sketch of an object, in which the hidden features of the object are represented through the top-down activation of the visual cortex from the cognitive centers of the brain. In some of these cases, top-down processes activate the early visual areas and fill in the missing features that become phenomenologically present. In other cases of cognitively driven amodal completion, the viewer simply forms a pure thought concerning the hidden structure in the absence of any activation of the visual areas and thus in the absence of mental imagery.

Appendix: Operational Constraints in Visual Processing

The studies reported in [26.3, 26.70, 26.71, 26.72] show that infants, almost from the very beginning, are constrained by a number of domain-specific principles about material objects and some of their properties. As Karmiloff-Smith [26.72] remarks, these constraints involve “attention biases toward particular inputs and a certain number of principled predispositions constraining the computation of those inputs”. Such predispositions include the conception of object persistence and four basic principles: boundedness, cohesion, rigidity, and no action at a distance.

The cohesion principle: “two surface points lie on the same object only if the points are linked by a path of connected surface points”. This entails that if some relative motion alters the adjacency relations among points at their borders, the surfaces lie on distinct objects, and that “all points on an object move on connected paths over space and time. When surface points appear at different places and times such that no connected path could unite their appearances, the surface points do not lie on the same object”.

According to the boundedness principle, “two surface points lie on distinct objects only if no path of connected surface points links them”. This principle determines the set of points that define an object’s boundary, and it entails that two distinct objects cannot interpenetrate, since two distinct bodies cannot occupy the same place at the same time.

Finally, the rigidity and no-action-at-a-distance principles specify that bodies move rigidly (unless other mechanisms show that a seemingly unique body is, in fact, a set of two distinct bodies) and that they move independently of one another (unless those mechanisms show that two seemingly separate objects are in fact connected).
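The cohesion and boundedness principles can be illustrated computationally as a connectivity test over surface points (a minimal sketch of my own, not the authors’ formalism): two points belong to the same object only if a path of connected surface points links them.

```python
# Sketch of the cohesion/boundedness test: two surface points lie on the
# same object only if a path of connected surface points links them.
from collections import deque

def same_object(points: set, p: tuple, q: tuple) -> bool:
    """Breadth-first search over 4-connected surface points on a grid."""
    if p not in points or q not in points:
        return False
    seen, queue = {p}, deque([p])
    while queue:
        x, y = queue.popleft()
        if (x, y) == q:
            return True
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nxt in points and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# Two separate blobs: cohesion groups each blob's points into one object;
# boundedness keeps the unconnected blobs apart as distinct objects.
surface = {(0, 0), (0, 1), (1, 1)} | {(5, 5), (5, 6)}
print(same_object(surface, (0, 0), (1, 1)))  # True: linked by a connected path
print(same_object(surface, (0, 0), (5, 5)))  # False: no connecting path
```

The grid representation and 4-connectivity are simplifying assumptions; the principles themselves are stated over continuous surfaces.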

Further studies shed light on the nature of these principles or constraints and on the neuronal mechanisms that may realize them. There is evidence that the physiological mechanisms underlying vision reflect these constraints; their physical makeup is such that they implement these constraints, from cells for edge detection to mechanisms implementing the epipolar constraint [26.73, 26.74]. Thus, one might claim that these principles are hardwired in our perceptual system.

The formation of the full primal sketch in Marr’s [26.12] theory relies upon the principles of local proximity (adjacent elements are combined) and of similarity (similarly oriented elements are combined). It also relies [26.20] upon the more general principle of closure (two edge segments may be joined even though their contrasts differ because of illumination effects).

Other principles used by early visual processing to solve the problem of the underdetermination of perception by the retinal image are those of continuity (the shapes of natural objects tend to vary smoothly and usually do not have abrupt discontinuities), proximity (since matter is cohesive, adjacent regions usually belong together and remain so even when the object moves), and similarity (since the same kind of surface absorbs and reflects light in the same way the different subregions of an object are likely to look similar).

The formation of the \(2\frac{1}{2}\)D sketch is similarly underdetermined, in that there is a great deal of ambiguity in matching features between the two images formed in the retinas of the two eyes, since there is usually more than one possible match. Stereopsis requires a unique matching, which means that the matching process must be constrained. The formation of the \(2\frac{1}{2}\)D sketch, therefore, relies upon a different set of operational constraints that guide stereopsis: “A given point on a physical surface has a unique position in space at some time” [26.69], matter is cohesive, and surfaces are generally smooth. These operational constraints give rise to the general constraints of compatibility (a pair of image elements are matched together if they are physically similar, since they originate from the same point on the surface of an object), of uniqueness (an item from one image matches with only one item from the other image), and of continuity (disparities must vary smoothly). Another constraint posited by all models of stereopsis is the epipolar constraint (the viewing geometry is known). Mayhew and Frisby’s [26.75] account of stereopsis posits some additional constraints, most notably the principle of figural continuity, according to which figural relationships are used to eliminate most of the alternative candidate matches between the two images.
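How the compatibility and uniqueness constraints prune the space of candidate matches can be sketched with a toy matcher (a hypothetical illustration of my own, not the Marr–Poggio algorithm or any published model): the epipolar constraint lets us restrict candidates to a single scanline, compatibility keeps only similar intensities, and uniqueness forbids reusing a matched item.

```python
# Toy stereo matcher for one scanline, illustrating three constraints:
# epipolar (only same-row candidates), compatibility (similar intensity),
# and uniqueness (each right-image item is matched at most once).
def match_scanline(left: list, right: list, tol: float = 0.1) -> dict:
    """Greedily pair each left feature with at most one unused,
    intensity-compatible right feature."""
    matches, used = {}, set()
    for i, lv in enumerate(left):
        candidates = [j for j, rv in enumerate(right)
                      if j not in used and abs(lv - rv) <= tol]
        if candidates:
            # Prefer the smallest disparity |i - j|, a continuity-flavored tie-break.
            j = min(candidates, key=lambda j: abs(i - j))
            matches[i] = j
            used.add(j)  # uniqueness: each right item matched only once
    return matches

left = [0.2, 0.8, 0.5]   # feature intensities along the left scanline
right = [0.5, 0.2, 0.8]  # the same features, shifted, in the right scanline
print(match_scanline(left, right))  # {0: 1, 1: 2, 2: 0}
```

Real stereo algorithms solve this as a global optimization over whole images rather than greedily, but the toy version shows how the constraints turn an ambiguous matching problem into a determinate one.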

Copyright information

© 2017 Springer-Verlag Berlin Heidelberg

Cite this chapter

Raftopoulos, A. (2017). Vision, Thinking, and Model-Based Inferences. In: Magnani, L., Bertolotti, T. (eds) Springer Handbook of Model-Based Science. Springer Handbooks. Springer, Cham. https://doi.org/10.1007/978-3-319-30526-4_26

  • Print ISBN: 978-3-319-30525-7

  • Online ISBN: 978-3-319-30526-4