Textons refer to fundamental micro-structures in generic natural images and thus constitute the basic elements in early (pre-attentive) visual perception. However, the word “texton” remains a vague concept in the literature of computer vision and visual perception, and a precise mathematical definition has yet to be found. In this article, we argue that the definition of texton should be governed by a sound mathematical model of images, and that the set of textons must be learned from, or best tuned to, an image ensemble. We adopt a generative image model in which an image is a superposition of bases from an over-complete dictionary; a texton is then defined as a mini-template that consists of a varying number of image bases with some geometric and photometric configurations. By analogy to physics, if image bases are like protons, neutrons and electrons, then textons are like atoms. A small number of textons can then be learned from training images as repeating micro-structures. We report four experiments for comparison. The first experiment computes clusters in the feature space of filter responses. The second uses transformed component analysis in both feature space and image patches. The third adopts a two-layer generative model where an image is generated by image bases and image bases are generated by textons. The fourth experiment shows textons from motion image sequences, which we call movetons.
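As a rough illustration of the first experiment (clusters in the feature space of filter responses), the sketch below filters an image with a small filter bank and runs k-means on the per-pixel response vectors. It is only a minimal sketch under stated assumptions: the abstract does not specify the filter bank, the clustering algorithm, or the number of clusters, so the three 3x3 filters, the plain k-means routine, the choice k=4, and the random test image are all hypothetical stand-ins.

```python
import numpy as np

def filter_bank():
    """Hypothetical 3x3 filter bank (not from the paper): two derivatives and a Laplacian."""
    dx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    dy = dx.T
    lap = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)
    return [dx, dy, lap]

def filter_responses(image, filters):
    """Correlate the image with each filter at interior ("valid") positions.

    Returns an array of shape (n_positions, n_filters): one feature vector
    of filter responses per pixel position.
    """
    h, w = image.shape
    feats = []
    for f in filters:
        k = f.shape[0]
        out = np.zeros((h - k + 1, w - k + 1))
        for i in range(k):
            for j in range(k):
                out += f[i, j] * image[i:i + h - k + 1, j:j + w - k + 1]
        feats.append(out.ravel())
    return np.stack(feats, axis=1)

def kmeans(x, k, n_iter=20, seed=0):
    """Plain k-means (an assumed choice): returns cluster centers and labels."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        d = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for c in range(k):
            members = x[labels == c]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[c] = members.mean(axis=0)
    # Final assignment against the updated centers.
    d = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, d.argmin(axis=1)

# Toy input: a random image stands in for a natural image.
rng = np.random.default_rng(0)
image = rng.random((32, 32))
feats = filter_responses(image, filter_bank())
centers, labels = kmeans(feats, k=4)
```

In this sketch each row of `centers` is a cluster center in filter-response space, and `labels` assigns every interior pixel position to its nearest center.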
- Independent Component Analysis
- Cluster Center
- Natural Image
- Image Patch
© 2002 Springer-Verlag Berlin Heidelberg
Cite this paper
Zhu, SC., Guo, Ce., Wu, Y., Wang, Y. (2002). What Are Textons?. In: Heyden, A., Sparr, G., Nielsen, M., Johansen, P. (eds) Computer Vision — ECCV 2002. ECCV 2002. Lecture Notes in Computer Science, vol 2353. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47979-1_53
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43748-2
Online ISBN: 978-3-540-47979-6
eBook Packages: Springer Book Archive