Throwing Down the Visual Intelligence Gauntlet

Tan, Cheston; Leibo, Joel Z.; Poggio, Tomaso

doi:10.1007/978-3-642-28661-2_1

Part of the book series: Studies in Computational Intelligence (SCI, volume 411)

Abstract

In recent years, scientific and technological advances have produced artificial systems that match or surpass human capabilities in narrow domains such as face detection and optical character recognition. However, the problem of producing truly intelligent machines remains far from solved. In this chapter, we first describe some of these recent advances, and then review one approach to moving beyond these limited successes: the neuromorphic approach of studying and reverse-engineering the networks of neurons in the human brain (specifically, the visual system). Finally, we discuss several possible future directions in the quest for visual intelligence.

Author information

Authors and Affiliations

McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
Cheston Tan, Joel Z. Leibo & Tomaso Poggio

Corresponding author

Correspondence to Cheston Tan.

Editor information

Editors and Affiliations

Department of Engineering, University of Cambridge, Trumpington Street, Cambridge, CB2 1PZ, United Kingdom
Roberto Cipolla
Dipartimento di Matematica e Informatica, Università di Catania, Viale Andrea Doria 6, Catania, 95125, Italy
Sebastiano Battiato
Dipartimento di Matematica e Informatica, Università di Catania, Viale A. Doria 6, Catania, 95125, Italy
Giovanni Maria Farinella

About this chapter

Cite this chapter

Tan, C., Leibo, J.Z., Poggio, T. (2013). Throwing Down the Visual Intelligence Gauntlet. In: Cipolla, R., Battiato, S., Farinella, G. (eds) Machine Learning for Computer Vision. Studies in Computational Intelligence, vol 411. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28661-2_1

DOI: https://doi.org/10.1007/978-3-642-28661-2_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28660-5
Online ISBN: 978-3-642-28661-2
eBook Packages: Engineering, Engineering (R0)
