An information theory perspective on computational vision

Yuille, Alan

doi:10.1007/s11460-010-0107-x

Research Article
Published: 05 August 2010

Volume 5, pages 329–346 (2010)
Frontiers of Electrical and Electronic Engineering in China

Alan Yuille¹

Abstract

This paper introduces computer vision from an information theory perspective. We discuss how vision can be thought of as a decoding problem where the goal is to find the most efficient encoding of the visual scene. This requires probabilistic models capable of capturing the complexity and ambiguities of natural images. We start by describing classic Markov Random Field (MRF) models of images. We stress the importance of having efficient inference and learning algorithms for these models and emphasize those approaches which use concepts from information theory. Next we introduce more powerful image models that have recently been developed and which are better able to deal with the complexities of natural images. These models use stochastic grammars and hierarchical representations. They are trained using images from increasingly large databases. Finally, we describe how techniques from information theory can be used to analyze vision models and measure the effectiveness of different visual cues.
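
As a rough illustration of the classic MRF image models the abstract refers to, the following is a minimal sketch of binary image denoising with an Ising-type energy. It is not taken from the paper: the energy form, the parameters `beta` (smoothness weight) and `eta` (data weight), the function names, and the choice of iterated conditional modes (ICM) as the inference step are all assumptions made for brevity. Minimizing the energy corresponds to maximizing the posterior probability, which under the coding view is the same as choosing the labeling with the shortest description length.

```python
import numpy as np

def energy(labels, observed, beta=1.0, eta=1.0):
    """Ising-type MRF energy for labels in {-1, +1} (hypothetical form)."""
    # Data term: reward agreement between labels and the observed image.
    data = -eta * np.sum(labels * observed)
    # Smoothness term: reward agreement between 4-connected neighbours.
    horizontal = np.sum(labels[:, :-1] * labels[:, 1:])
    vertical = np.sum(labels[:-1, :] * labels[1:, :])
    return float(data - beta * (horizontal + vertical))

def icm_denoise(observed, beta=1.0, eta=1.0, max_sweeps=10):
    """Greedy coordinate-wise energy minimization (ICM).

    Each pixel is set to the label that minimizes the energy given its
    neighbours, so the total energy never increases. ICM finds a local
    minimum only; it is used here because it is short, not because it
    is the method the paper advocates.
    """
    labels = observed.copy()
    height, width = labels.shape
    for _ in range(max_sweeps):
        changed = False
        for i in range(height):
            for j in range(width):
                neighbours = 0
                if i > 0:
                    neighbours += labels[i - 1, j]
                if i < height - 1:
                    neighbours += labels[i + 1, j]
                if j > 0:
                    neighbours += labels[i, j - 1]
                if j < width - 1:
                    neighbours += labels[i, j + 1]
                # Local field: the sign of this value is the best label.
                field = beta * neighbours + eta * observed[i, j]
                new_label = 1 if field >= 0 else -1
                if new_label != labels[i, j]:
                    labels[i, j] = new_label
                    changed = True
        if not changed:
            break
    return labels
```

For example, calling `icm_denoise` on an 8×8 array of `+1` values with a single pixel flipped to `-1` restores the flipped pixel, because the four agreeing neighbours outweigh the single observation at that site.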

Author information

Authors and Affiliations

Department of Statistics, University of California at Los Angeles, Los Angeles, CA, 90095, USA
Alan Yuille

Corresponding author

Correspondence to Alan Yuille.

Additional information

Alan Yuille received his B.A. in mathematics from the University of Cambridge in 1976, and completed his Ph.D. in theoretical physics at Cambridge in 1980, studying under Stephen Hawking. Following this, he held a postdoc position with the Physics Department, University of Texas at Austin, and the Institute for Theoretical Physics, Santa Barbara. He then joined the Artificial Intelligence Laboratory at MIT (1982–1986), and followed this with a faculty position in the Division of Applied Sciences at Harvard (1986–1995), rising to the position of associate professor. From 1995 to 2002, Alan worked as a senior scientist at the Smith-Kettlewell Eye Research Institute in San Francisco. In 2002 he accepted a position as full professor in the Department of Statistics at the University of California, Los Angeles. He has over one hundred and fifty peer-reviewed publications in vision, neural networks, and physics, and has co-authored two books: Data Fusion for Sensory Information Processing Systems (with J. J. Clark) and Two- and Three-Dimensional Patterns of the Face (with P. W. Hallinan, G. G. Gordon, P. J. Giblin and D. B. Mumford); he also co-edited the book Active Vision (with A. Blake). He has won several academic prizes and is a Fellow of IEEE.

About this article

Cite this article

Yuille, A. An information theory perspective on computational vision. Front. Electr. Electron. Eng. China 5, 329–346 (2010). https://doi.org/10.1007/s11460-010-0107-x

Received: 19 April 2010
Accepted: 29 April 2010
Published: 05 August 2010
Issue Date: September 2010
DOI: https://doi.org/10.1007/s11460-010-0107-x
