Throwing Down the Visual Intelligence Gauntlet

  • Chapter
Machine Learning for Computer Vision

Part of the book series: Studies in Computational Intelligence ((SCI,volume 411))

Abstract

In recent years, scientific and technological advances have produced artificial systems that have matched or surpassed human capabilities in narrow domains such as face detection and optical character recognition. However, the problem of producing truly intelligent machines still remains far from being solved. In this chapter, we first describe some of these recent advances, and then review one approach to moving beyond these limited successes – the neuromorphic approach of studying and reverse-engineering the networks of neurons in the human brain (specifically, the visual system). Finally, we discuss several possible future directions in the quest for visual intelligence.

Author information

Corresponding author

Correspondence to Cheston Tan.

Copyright information

© 2013 Springer Berlin Heidelberg

About this chapter

Cite this chapter

Tan, C., Leibo, J.Z., Poggio, T. (2013). Throwing Down the Visual Intelligence Gauntlet. In: Cipolla, R., Battiato, S., Farinella, G. (eds) Machine Learning for Computer Vision. Studies in Computational Intelligence, vol 411. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28661-2_1

  • DOI: https://doi.org/10.1007/978-3-642-28661-2_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28660-5

  • Online ISBN: 978-3-642-28661-2

  • eBook Packages: Engineering (R0)
