Biological Cybernetics, Volume 93, Issue 1, pp 79–90

Learning viewpoint invariant object representations using a temporal coherence principle

  • Wolfgang Einhäuser
  • Jörg Hipp
  • Julian Eggert
  • Edgar Körner
  • Peter König
Article

Abstract

Invariant object recognition is arguably one of the major challenges for contemporary machine vision systems. In contrast, the mammalian visual system performs this task virtually effortlessly. How can we exploit our knowledge of the biological system to improve artificial systems? Our understanding of the mammalian early visual system has been augmented by the discovery that general coding principles could explain many aspects of neuronal response properties. How can such schemes be transferred to system-level performance? In the present study we train cells on a particular variant of the general principle of temporal coherence, the “stability” objective. These cells are trained on unlabeled real-world images without a teaching signal. We show that after training, the cells form a representation that is largely independent of the viewpoint from which the stimulus is viewed. This finding includes generalization to previously unseen viewpoints. The achieved representation is better suited for viewpoint-invariant object classification than the cells’ input patterns. This ability to facilitate viewpoint-invariant classification is maintained even if training and classification take place in the presence of a distractor object that is also unlabeled. In summary, we show that unsupervised learning using a general coding principle facilitates the classification of real-world objects that are not segmented from the background and undergo complex, non-isomorphic transformations.
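
To make the idea of a temporal "stability" objective more concrete, the sketch below scores how slowly a single unit's activity changes over time. It is only an illustration under stated assumptions: the abstract does not give the objective's formula, so the form used here (negative mean squared frame-to-frame change, divided by the variance of the trace) is one common way to express temporal slowness and may differ from the authors' definition. The function name `stability` and the example traces are hypothetical.

```python
from statistics import pvariance


def stability(activity):
    """Generic temporal-stability score for one unit's activity trace.

    ASSUMPTION: this is a common slowness-style formulation, not
    necessarily the objective used in the paper. It returns the negative
    mean squared frame-to-frame difference divided by the variance of the
    trace, so values closer to 0 indicate a more stable response.
    """
    if len(activity) < 2:
        raise ValueError("need at least two time steps")
    var = pvariance(activity)
    if var == 0:
        # A constant trace is trivially stable but carries no information;
        # dividing by the variance is what rules this solution out.
        raise ValueError("constant trace: variance is zero")
    diffs = [(b - a) ** 2 for a, b in zip(activity, activity[1:])]
    return -(sum(diffs) / len(diffs)) / var


# Hypothetical traces: one drifts slowly, one flickers between two values.
slow = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
fast = [0.0, 0.5, 0.0, 0.5, 0.0, 0.5]

print(stability(slow) > stability(fast))
```

Under this formulation, a unit whose response varies smoothly as the viewpoint changes scores higher than one whose response flickers from frame to frame, which is the intuition behind learning viewpoint-invariant responses from temporally coherent input.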

Keywords

Machine Vision; Object Representation; Unsupervised Learning; Temporal Coherence; Machine Vision System
These keywords were added by machine and not by the authors.

Copyright information

© Springer-Verlag 2005

Authors and Affiliations

  • Wolfgang Einhäuser (1, 4), corresponding author
  • Jörg Hipp (1)
  • Julian Eggert (2)
  • Edgar Körner (2)
  • Peter König (3)

  1. Institute of Neuroinformatics, University & ETH Zürich, Zürich, Switzerland
  2. HONDA Research Institute Europe GmbH, Offenbach/Main, Germany
  3. Institute of Cognitive Science, Department Neurobiopsychology, University of Osnabrück, Osnabrück, Germany
  4. California Institute of Technology, Division of Biology, Pasadena, USA