
A Klein-Bottle-Based Dictionary for Texture Representation

Published in: International Journal of Computer Vision

Abstract

A natural object of study in texture representation and material classification is the probability density function, in pixel-value space, underlying the set of small patches from the given image. Inspired by the fact that small \(n\times n\) high-contrast patches from natural gray-scale images accumulate with high density around a surface \(\fancyscript{K}\subset {\mathbb {R}}^{n^2}\) with the topology of a Klein bottle (Carlsson et al., International Journal of Computer Vision 76(1):1–12, 2008), we present in this paper a novel framework for estimating and representing the distribution, around \(\fancyscript{K}\), of patches from texture images. More specifically, we show that most \(n\times n\) patches from a given image can be projected onto \(\fancyscript{K}\), yielding a finite sample \(S\subset \fancyscript{K}\) whose underlying probability density function can be represented in terms of Fourier-like coefficients, which, in turn, can be estimated from \(S\). We show that image rotation acts as a linear transformation at the level of the estimated coefficients, and use this to define a multi-scale rotation-invariant descriptor. We test it by classifying the materials in three popular datasets: the CUReT, UIUCTex, and KTH-TIPS texture databases.
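The patch-space setup described above can be made concrete with a short sketch. The snippet below extracts all \(n\times n\) patches from a gray-scale image, mean-centers each, and keeps only the highest-contrast fraction as unit vectors in \({\mathbb {R}}^{n^2}\). This is illustrative preprocessing in the spirit of the high-contrast-patch studies cited in the abstract, not the paper's exact pipeline; the function name and the 20% contrast cutoff are arbitrary choices.

```python
import numpy as np

def high_contrast_patches(img, n=3, keep_frac=0.2):
    """Extract all n-by-n patches, mean-center each, and keep the
    highest-contrast fraction (contrast = Euclidean norm after
    centering), returned as unit vectors in R^(n^2).
    Illustrative only; the cutoff fraction is an arbitrary choice."""
    H, W = img.shape
    patches = np.array([
        img[i:i + n, j:j + n].ravel()
        for i in range(H - n + 1)
        for j in range(W - n + 1)
    ], dtype=float)
    patches -= patches.mean(axis=1, keepdims=True)   # remove mean intensity
    norms = np.linalg.norm(patches, axis=1)
    cutoff = np.quantile(norms, 1 - keep_frac)       # keep top fraction
    out = patches[norms >= cutoff]
    return out / np.linalg.norm(out, axis=1)[:, None]  # unit-norm patches
```

Each row of the result is a candidate point to be projected onto \(\fancyscript{K}\) in a later stage.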

Notes

  1. Available at http://www.kyb.tuebingen.mpg.de/?id=227.

  2. Provided one has a “continuous projection” such as the one described in Sect. 3.1.

  3. Preprint available at http://arxiv.org/abs/1112.1993.

  4. Convergence is with respect to the weak-* topology. This result is a consequence of the \(N\)-representation theorem (V.14, p. 143, Reed and Simon (1972)).

  5. http://sipi.usc.edu/database.

  6. http://www.cs.columbia.edu/CAVE/software/curet.

  7. http://www.nada.kth.se/cvap/databases/kth-tips.

  8. http://www-cvr.ai.uiuc.edu/ponce_grp/data.

  9. http://www.cse.wustl.edu/~kilian/code/code.html.

References

  • Aherne, F. J., Thacker, N. A., & Rockett, P. I. (1998). The Bhattacharyya metric as an absolute similarity measure for frequency coded data. Kybernetika, 34(4), 363–368.

  • Bell, A. J., & Sejnowski, T. J. (1997). The “independent components” of natural scenes are edge filters. Vision Research, 37(23), 3327.

  • Beyer, K., Goldstein, J., Ramakrishnan, R., & Shaft, U. (1999). When is “nearest neighbor” meaningful? In Database Theory-ICDT’99 (pp. 217–235).

  • Broadhurst, R. E. (2005). Statistical estimation of histogram variation for texture classification. In Proceedings of the International Workshop on Texture Analysis and Synthesis (pp. 25–30).

  • Brodatz, P. (1966). Textures: A photographic album for artists and designers (Vol. 66). New York: Dover.

  • Carlsson, G. (2009). Topology and data. Bulletin of the American Mathematical Society, 46(2), 255.

  • Carlsson, G., Ishkhanov, T., De Silva, V., & Zomorodian, A. (2008). On the local behavior of spaces of natural images. International Journal of Computer Vision, 76(1), 1–12.

  • Crosier, M., & Griffin, L. D. (2010). Using basic image features for texture classification. International Journal of Computer Vision, 88(3), 447–460.

  • Dana, K. J., Van Ginneken, B., Nayar, S. K., & Koenderink, J. J. (1999). Reflectance and texture of real-world surfaces. ACM Transactions on Graphics (TOG), 18(1), 1–34.

  • De Silva, V., Morozov, D., & Vejdemo-Johansson, M. (2011). Persistent cohomology and circular coordinates. Discrete and Computational Geometry, 45(4), 737–759.

  • De Wit, T. D., & Floriani, E. (1998). Estimating probability densities from short samples: A parametric maximum likelihood approach. Physical Review E, 58(4), 5115.

  • Edelman, A., & Murakami, H. (1995). Polynomial roots from companion matrix eigenvalues. Mathematics of Computation, 64(210), 763–776.

  • Franzoni, G. (2012). The Klein bottle: Variations on a theme. Notices of the AMS, 59(8), 1094–1099.

  • Lewis, D. (2005). Feature classes for 1D, 2nd order image structure arise from natural image maximum likelihood statistics. Network, 16(2–3), 301–320.

  • Griffin, L. D. (2007). The second order local-image-structure solid. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(8), 1355–1366.

  • Harris, C., & Stephens, M. (1988). A combined corner and edge detector. In Proceedings of the Fourth Alvey Vision Conference (pp. 147–151).

  • Hatcher, A. (2002). Algebraic topology. Cambridge: Cambridge University Press.

  • Hayman, E., Caputo, B., Fritz, M., & Eklundh, J. O. (2004). On the significance of real-world conditions for material classification. In Computer Vision-ECCV 2004 (Vol. 3024, pp. 253–266).

  • Hubel, D. H., & Wiesel, T. N. (1959). Receptive fields of single neurones in the cat’s striate cortex. The Journal of Physiology, 148(3), 574–591.

  • Hubel, D. H., & Wiesel, T. N. (1968). Receptive fields and functional architecture of monkey striate cortex. The Journal of Physiology, 195(1), 215–243.

  • Jurie, F., & Triggs, B. (2005). Creating efficient codebooks for visual recognition. In Tenth IEEE International Conference on Computer Vision (ICCV 2005) (Vol. 1, pp. 604–610).

  • Koenderink, J. J. (1984). The structure of images. Biological Cybernetics, 50(5), 363–370.

  • Lazebnik, S., Schmid, C., & Ponce, J. (2005). A sparse texture representation using local affine regions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(8), 1265–1278.

  • Lee, A. B., Pedersen, K. S., & Mumford, D. (2003). The nonlinear statistics of high-contrast patches in natural images. International Journal of Computer Vision, 54(1), 83–103.

  • Leung, T., & Malik, J. (2001). Representing and recognizing the visual appearance of materials using three-dimensional textons. International Journal of Computer Vision, 43(1), 29–44.

  • Moler, C. (1991). Cleve’s corner: Roots-of polynomials, that is. The MathWorks Newsletter, 5(1), 8–9.

  • Pedersen, K. S., & Lee, A. B. (2002). Toward a full probability model of edges in natural images. In Computer Vision-ECCV 2002 (pp. 328–342). Springer.

  • Reed, M., & Simon, B. (1972). Methods of modern mathematical physics: Functional analysis (Vol. 1). New York: Academic Press.

  • Rubner, Y., Tomasi, C., & Guibas, L. J. (2000). The earth mover’s distance as a metric for image retrieval. International Journal of Computer Vision, 40(2), 99–121.

  • Silverman, B. W. (1986). Density estimation for statistics and data analysis (Vol. 26). London: Chapman & Hall.

  • van Hateren, J. H., & van der Schaaf, A. (1998). Independent component filters of natural images compared with simple cells in primary visual cortex. Proceedings of the Royal Society of London, Series B: Biological Sciences, 265(1394), 359–366.

  • Varma, M., & Ray, D. (2007). Learning the discriminative power-invariance trade-off. In IEEE 11th International Conference on Computer Vision (ICCV 2007) (pp. 1–8).

  • Varma, M., & Zisserman, A. (2004). Unifying statistical texture classification frameworks. Image and Vision Computing, 22(14), 1175–1183.

  • Varma, M., & Zisserman, A. (2005). A statistical approach to texture classification from single images. International Journal of Computer Vision, 62(1), 61–81.

  • Varma, M., & Zisserman, A. (2009). A statistical approach to material classification using image patch exemplars. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(11), 2032–2047.

  • Watson, G. S. (1969). Density estimation by orthogonal series. The Annals of Mathematical Statistics, 40(4), 1496–1498.

  • Weinberger, K. Q., & Saul, L. K. (2009). Distance metric learning for large margin nearest neighbor classification. The Journal of Machine Learning Research, 10, 207–244.

  • Zhang, J., Marszalek, M., Lazebnik, S., & Schmid, C. (2007). Local features and kernels for classification of texture and object categories: A comprehensive study. International Journal of Computer Vision, 73(2), 213–238.


Acknowledgments

Jose Perea was partially supported by the National Science Foundation (NSF) through grant DMS 0905823. Gunnar Carlsson was supported by the NSF through grants DMS 0905823 and DMS 096422, by the Air Force Office of Scientific Research through grants FA9550-09-1-0643 and FA9550-09-1-0531, and by the National Institutes of Health through grant I-U54-ca149145-01.

Author information

Correspondence to Jose A. Perea.

Appendix: Main Results and Proofs

Proof (Proposition 1)

  1. The Cauchy-Schwarz inequality implies that for every \(\mathbf {v}\in S^1\) and every almost-everywhere differentiable \(I_P\) (not necessarily purely directional) one has

    $$\begin{aligned}Q_P(\mathbf {v}) \le \int \int \limits _{[-1 ,1]^2} \Vert \nabla I_P \Vert ^2 dxdy.\end{aligned}$$

    Since the equality holds when \(I_P(x,y) = g(ax + by)\) and \(\mathbf {v} = \begin{bmatrix}a\\ b\end{bmatrix}\), the result follows.

  2. Let \(\lambda _{max} \ge \lambda _{min} \ge 0\) be the eigenvalues of \(A_P\), and let \(B_P\) be a unitary matrix such that

    $$\begin{aligned}B_P^{T}A_PB_P = \begin{bmatrix}\lambda _{max}&0 \\ 0&\lambda _{min}\end{bmatrix}.\end{aligned}$$

    If \(\mathbf {v}\in S^1\) and \( \mathbf {v} = B_P\begin{bmatrix}v_x \\ v_y\end{bmatrix}\), then \(1 = \Vert \mathbf {v}\Vert ^2 = v_x^2 +v_y^2\), \(Q_P(\mathbf {v}) = \lambda _{max} v_x^2 + \lambda _{min}v_y^2\) and therefore

    $$\begin{aligned}\max _{\Vert \mathbf {v}\Vert =1}Q_P(\mathbf {v})= \lambda _{max}.\end{aligned}$$

    Finally, since \(Q_P(\mathbf {v}) = \lambda _{max}\) for \(\mathbf {v}\in S^1\) if and only if \(\mathbf {v} \in E_{max}(A_P)\), we obtain the result.

  3. If the eigenvalues of \(A_P\) are distinct, then \(E_{max}(A_P)\) is one dimensional and thus intersects \(S^1\) at exactly two antipodal points.\(\square \)
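Numerically, the maximizer in Proposition 1 is the top eigenvector of the symmetric \(2\times 2\) matrix \(A_P\). A minimal sketch, assuming \(A_P\) has already been assembled from the patch gradients (`dominant_direction` is a hypothetical helper name):

```python
import numpy as np

def dominant_direction(A):
    """Unit vector v maximizing Q_P(v) = v^T A v over the circle:
    the eigenvector of the larger eigenvalue of the symmetric 2x2 A.
    Returned up to sign, matching the two antipodal maximizers."""
    w, V = np.linalg.eigh(A)     # eigh returns eigenvalues in ascending order
    return V[:, -1], w[-1]       # top eigenvector and lambda_max
```

Since the maximizer is only defined up to sign, downstream code should compare directions via \(|\langle \mathbf {v}_1, \mathbf {v}_2\rangle |\).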

Proof (Proposition 2) Since \(u\) and \(\sqrt{3}u^2\) are orthonormal with respect to \(\langle \cdot , \cdot \rangle _D\), it follows that

$$\begin{aligned} \mathop {\text {argmin }}\limits _{c^2 + d^2 = 1} \varPhi (c,d)&= \mathop {\text {argmin }}\limits _{c^2 +d^2 =1} \varPhi (c,d)^2\\&= \mathop {\text {argmax }}\limits _{c^2 + d^2 =1} \left\langle I_P , cu + d\sqrt{3}u^2 \right\rangle _D \\&= \mathop {\text {argmax }}\limits _{c^2 + d^2 =1} \left\langle \begin{bmatrix}c \\ d\end{bmatrix}, \begin{bmatrix}\langle I_P ,u \rangle _D \\ \\ \langle I_P, \sqrt{3}u^2\rangle _D\end{bmatrix} \right\rangle \end{aligned}$$

which can be found via the usual condition of parallelism, provided \(\varphi (I_P,\alpha ) \ne 0\). \(\square \)
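The parallelism condition at the end of this proof amounts to normalizing the vector of inner products. A small sketch (`best_unit_coeffs` is an illustrative name; its inputs stand for \(\langle I_P, u\rangle _D\) and \(\langle I_P, \sqrt{3}u^2\rangle _D\)):

```python
import numpy as np

def best_unit_coeffs(p, q):
    """argmax over c^2 + d^2 = 1 of the inner product with (p, q):
    by Cauchy-Schwarz the maximizer is (p, q) normalized, provided
    (p, q) != 0 (the non-degeneracy condition in Proposition 2)."""
    w = np.array([p, q], dtype=float)
    nrm = np.linalg.norm(w)
    if nrm == 0.0:
        raise ValueError("degenerate case: both inner products vanish")
    return w / nrm
```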

Proof (Proposition 3) Let \(g\in L^2(T)\) and consider

$$\begin{aligned} g^\perp (z,w)= \frac{g(z,w) + g\left( -z,-\overline{w}\right) }{2} \end{aligned}$$

It follows that \(g^\perp \) is square-integrable on \(T\), and that:

  1. \(g^\perp \left( -z,-\overline{w}\right) = g^\perp (z,w)\) for every \((z,w)\in T\). Hence \(g^\perp \in L^2(K)\) for every \(g\in L^2(T)\).

  2. If \(g_1,g_2\in L^2(T)\) and \(a_1,a_2\in {\mathbb {C}}\), then

    $$\begin{aligned} (a_1g_1 + a_2g_2)^\perp = a_1g_1^\perp + a_2 g_2^\perp \end{aligned}$$
  3. \(g^\perp = g\) for every \(g\in L^2(K)\).

We claim that \(g^\perp \) is the orthogonal projection of \(g\in L^2(T)\) onto \(L^2(K)\), and all we have to check is that \(g -g^\perp \) is perpendicular to every \(h\in L^2(K)\). To this end, let us introduce the notation

$$\begin{aligned} g^*(z,w) = g(-z,-\overline{w}). \end{aligned}$$

By writing the inner product of \(L^2(T)\) in polar coordinates, using the substitution \((\alpha ,\theta )\mapsto (\alpha + \pi ,\pi - \theta )\), and the fact that \(h\) satisfies Eq. 3.3, one obtains that

$$\begin{aligned} \langle g^*, h\rangle _T&= \frac{1}{(2\pi )^2} \int _{-\frac{\pi }{2}}^{\frac{3\pi }{2}} \int _{\frac{\pi }{4}}^{\frac{9\pi }{4}} g(\alpha + \pi ,\pi -\theta )\overline{h}(\alpha ,\theta ) d\alpha d\theta \\&= \frac{1}{(2\pi )^2} \int _{-\frac{\pi }{2}}^{\frac{3\pi }{2}} \int _{\frac{\pi }{4}}^{\frac{9\pi }{4}} g(\alpha ,\theta )\overline{h} (\alpha + \pi ,\pi -\theta )d\alpha d\theta \\&= \frac{1}{(2\pi )^2} \int _{-\frac{\pi }{2}}^{\frac{3\pi }{2}} \int _{\frac{\pi }{4}}^{\frac{9\pi }{4}} g(\alpha ,\theta )\overline{h} (\alpha ,\theta )d\alpha d\theta =\langle g,h \rangle _T \end{aligned}$$

Therefore \( 2\langle g - g^\perp , h\rangle _T = \langle g, h\rangle _T - \langle g^*, h\rangle _T = 0 \) and we get the result. \(\square \)
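A discrete analogue of the projection \(g\mapsto g^\perp \) can be checked numerically: on a grid over \([0,2\pi )^2\) the involution \((\alpha ,\theta )\mapsto (\alpha +\pi ,\pi -\theta )\) becomes an index permutation, and averaging \(g\) with its pullback is an orthogonal projection (idempotent and self-adjoint). A sketch, with the grid sizes and helper name chosen purely for illustration:

```python
import numpy as np

def symmetrize(g):
    """Discrete analogue of g -> g_perp in Proposition 3.
    g is sampled on an N x M grid (alpha_i, theta_j) over [0, 2pi)^2;
    the involution (alpha, theta) -> (alpha + pi, pi - theta) becomes
    an index map, and the projection averages g with its pullback."""
    N, M = g.shape
    assert N % 2 == 0 and M % 2 == 0
    ia = (np.arange(N) + N // 2) % N          # alpha -> alpha + pi
    it = (M // 2 - np.arange(M)) % M          # theta -> pi - theta
    g_star = g[np.ix_(ia, it)]                # pullback of g
    return 0.5 * (g + g_star)
```

Because the pullback operator is an involutive permutation (hence orthogonal and symmetric), the average \((I + U)/2\) is indeed an orthogonal projection, mirroring the argument above.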

Proof (Theorem 1) By continuity, \(\varPi \) takes spanning sets to spanning sets, which proves the first part of the theorem. Now, if we consider the decomposition

$$\begin{aligned} L^{2}(T) = L^{2}(K)\oplus L^{2}(K)^{\perp } \end{aligned}$$

where \(L^2(K)^\perp \) denotes the orthogonal complement of \(L^2(K)\) in \(L^2(T)\), and \(\fancyscript{B}\) is an orthonormal basis for \(L^2(K)\), then for any orthonormal basis \(\fancyscript{B}^\prime \) of \(L^2(K)^\perp \) we have that \(\fancyscript{B}\cup \fancyscript{B}^\prime \) is an orthonormal basis for \(L^2(T)\). The key observation is that since \(\ker (\varPi ) = L^2(K)^\perp \) and \(\varPi \) restricted to \(L^2(K)\) is the identity, we have \(\varPi (\fancyscript{B}^\prime ) = \{\mathbf {0}\}\) and therefore \(\varPi (\fancyscript{B}\cup \fancyscript{B}^\prime ) = \fancyscript{B}\cup \{\mathbf {0}\}\). It follows that the only subset of \(\fancyscript{B}\cup \{\mathbf {0}\}\) which can be a basis for \(L^2(K)\) is \(\fancyscript{B}\), and that it is invariant under Gram-Schmidt. \(\square \)
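The Gram-Schmidt argument has a simple finite-dimensional analogue: project a spanning set of the ambient space and orthonormalize, discarding vectors that project to zero, and an orthonormal basis of the subspace is recovered. A sketch (the pruning tolerance is an arbitrary choice):

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-12):
    """Gram-Schmidt with pruning: vectors whose residual is
    (numerically) zero are discarded, reducing a spanning set to an
    orthonormal basis, the finite-dimensional analogue of the
    Theorem 1 argument."""
    basis = []
    for v in vectors:
        u = v - sum(np.dot(v, b) * b for b in basis)  # remove components
        nrm = np.linalg.norm(u)
        if nrm > tol:
            basis.append(u / nrm)
    return np.array(basis)
```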

Proof (Theorem 3) If \(f,g\in L^2(K,{\mathbb {R}})\), then a change of coordinates shows that \(\langle f^\tau , g \rangle _K = \langle f, g^{-\tau }\rangle _K\), and therefore

$$\begin{aligned} a_m^\tau&= a_m \\ b_n^\tau&= \langle f^\tau , \sqrt{2}\cos (2n\alpha )\rangle _K \\&= \langle f, \sqrt{2}\cos (2n(\alpha + \tau )) \rangle _K = \cos (2n\tau ) b_n - \sin (2n\tau ) c_n \\ c_n^\tau&= \langle f, \sqrt{2} \sin (2n(\alpha + \tau )) \rangle _K \\&= \cos (2n\tau ) c_n + \sin (2n\tau )b_n. \end{aligned}$$

Similarly

$$\begin{aligned} d_{n,m}^\tau&= \cos (n\tau )d_{n,m} -\sin (n\tau )e_{n,m} \\ e_{n,m}^\tau&= \cos (n\tau )e_{n,m} +\sin (n\tau )d_{n,m}. \end{aligned}$$

\(\square \)

Corollary 4

Let \(T_\tau :\ell ^2({\mathbb {R}}) \longrightarrow \ell ^2({\mathbb {R}})\) be as in Theorem 3. Then

$$\begin{aligned} T_{\tau + \beta } = T_\tau \circ T_\beta . \end{aligned}$$
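The action of Theorem 3 restricted to the pairs \((b_n, c_n)\), together with the composition rule of Corollary 4, can be verified numerically. A sketch (the helper name is illustrative; the pairs \((d_{n,m}, e_{n,m})\) would be handled the same way with angle \(n\tau \) instead of \(2n\tau \)):

```python
import numpy as np

def rotate_coeffs(b, c, tau):
    """Action of image rotation by tau on the coefficient pairs
    (b_n, c_n) from Theorem 3: the n-th pair is rotated by angle
    2*n*tau, i.e. T_tau is block diagonal with 2x2 rotation blocks."""
    b, c = np.asarray(b, float), np.asarray(c, float)
    n = np.arange(1, b.size + 1)
    ct, st = np.cos(2 * n * tau), np.sin(2 * n * tau)
    return ct * b - st * c, ct * c + st * b
```

Since each block is a rotation, \(T_\tau \) is an isometry of \(\ell ^2({\mathbb {R}})\), and composing rotations by \(\tau \) and \(\beta \) gives the rotation by \(\tau + \beta \), which is exactly Corollary 4.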

Proof (Theorem 4) From the description of \(T_\tau \) as a block diagonal matrix whose blocks are rotation matrices (Theorem 3), it follows that

$$\begin{aligned} \mathop {\text {argmin }}\limits _{\tau } \varPsi (-\tau )&= \mathop {\text {argmax }}\limits _{\tau } \left\langle K\fancyscript{F}_w(f), T_\tau \left( K\fancyscript{F}_w(g)\right) \right\rangle \\&= \mathop {\text {argmax }}\limits _{\tau } \sum _{n=1}^{w} x_n \cos (n\tau ) + y_n\sin (n\tau ) \end{aligned}$$

where \(x_n\) and \(y_n\) depend solely on \(K\fancyscript{F}_w(f)\) and \(K\fancyscript{F}_w(g)\). By taking the derivative with respect to \(\tau \), we get that if \(\tau ^*\) is a minimizer for \(\varPsi (\tau )\) then we must have

$$\begin{aligned} \sum _{n=1}^w ny_n \cos (n\tau ^*) - nx_n\sin (n\tau ^*) = 0 \end{aligned}$$

which, setting \(\xi _* = e^{i\tau ^*}\), is equivalent to having

$$\begin{aligned} Re\left( \sum _{n=1}^w n(y_n + i x_n)\xi _*^n\right) = 0. \end{aligned}$$

Let \(q(z)= \sum \limits _{n=1}^w n(y_n + ix_n)z^n,\; \bar{q}(z)= \sum \limits _{n=1}^w n(y_n - ix_n)z^n\) and

$$\begin{aligned} p(z) = z^{w}\cdot \left( q(z) + \bar{q}\left( \frac{1}{z}\right) \right) . \end{aligned}$$

It follows that \(p(z)\) is a complex polynomial of degree less than or equal to \(2w\), and so that

$$\begin{aligned} p\left( \xi _*\right)&= \xi _*^w \left( q\left( \xi _*\right) + \bar{q}\left( \frac{1}{\xi _*}\right) \right) = \xi _*^w \left( q\left( \xi _*\right) + \bar{q}\left( \overline{\xi _*}\right) \right) \\&= 2\xi _*^w \cdot Re\left( q\left( \xi _*\right) \right) = 0. \end{aligned}$$

\(\square \)
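The root-finding recipe in this proof is directly implementable: the coefficient vector of \(p(z)\) is conjugate-palindromic, and `np.roots` computes its zeros via companion-matrix eigenvalues, the method of the Edelman and Murakami reference. A sketch, assuming the \(x_n, y_n\) are given; only roots near the unit circle correspond to critical angles:

```python
import numpy as np

def best_rotation(x, y):
    """Maximize F(tau) = sum_n x_n cos(n tau) + y_n sin(n tau) by the
    Theorem 4 recipe: critical points xi = e^{i tau} are roots of the
    degree-2w polynomial p(z) = z^w (q(z) + qbar(1/z)), where
    q(z) = sum_n n (y_n + i x_n) z^n."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    w = x.size
    n = np.arange(1, w + 1)
    coef = np.zeros(2 * w + 1, dtype=complex)   # coef[k] multiplies z^k
    coef[w + n] += n * (y + 1j * x)             # z^w * q(z)
    coef[w - n] += n * (y - 1j * x)             # z^w * qbar(1/z)
    roots = np.roots(coef[::-1])                # np.roots wants high -> low
    taus = np.angle(roots[np.abs(np.abs(roots) - 1) < 1e-6])
    F = lambda t: np.sum(x * np.cos(n * t) + y * np.sin(n * t))
    return max(taus, key=F)                     # best critical angle
```

Evaluating \(F\) at the on-circle roots and keeping the best one yields the global maximizer, since the maximizer of a smooth periodic function is among its critical points.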

Corollary 5

The vector \(T_\tau \left( \widehat{K\fancyscript{F}}(f)\right) \) is a componentwise unbiased estimator for \(K\fancyscript{F}(f^\tau )\), which converges almost surely as the sample size tends to infinity.

Proof (Proposition 4) Let \(\widehat{\mathbf {v}}^\tau \) be the vector with entries \(\widehat{d}_{1,1}^\tau \) and \(\widehat{e}_{1,1}^\tau \) for \(\widehat{K\fancyscript{F}}(f^\tau ,S^\tau )\). It follows from Theorem 3 and Corollaries 4 and 5 that

$$\begin{aligned} \widehat{\mathbf {v}}^\tau&= \begin{bmatrix}\cos (\tau )&-\sin (\tau ) \\ \sin (\tau )&\cos (\tau )\end{bmatrix} \mathbf {\widehat{v}} \\&= \begin{bmatrix}\cos \left( \tau -\sigma (f)\right)&-\sin \left( \tau -\sigma (f)\right) \\ \sin \left( \tau -\sigma (f)\right)&\cos \left( \tau -\sigma (f)\right) \end{bmatrix} \begin{bmatrix}\Vert \mathbf {\widehat{v}}\Vert \\ \\ 0\end{bmatrix}\\&= \begin{bmatrix}\cos \left( \tau -\sigma (f)\right)&-\sin \left( \tau -\sigma (f)\right) \\ \sin \left( \tau -\sigma (f)\right)&\cos \left( \tau -\sigma (f)\right) \end{bmatrix} \begin{bmatrix}\left\| \mathbf {\widehat{v}}^\tau \right\| \\ \\ 0\end{bmatrix} \end{aligned}$$

from which we get \(\sigma \left( f^\tau \right) \equiv \sigma (f) - \tau \;\; (\mathrm {mod}\;\;2\pi )\), and

$$\begin{aligned} T_{\sigma \left( f^\tau \right) }\left( \widehat{K\fancyscript{F}}(f^\tau ,S^\tau )\right)&= T_{\sigma (f) - \tau } \circ T_\tau \left( \widehat{K\fancyscript{F}}(f,S)\right) \\&= T_{\sigma (f)} \left( \widehat{K\fancyscript{F}}(f,S)\right) \end{aligned}$$

as claimed. \(\square \)

Cite this article

Perea, J.A., Carlsson, G. A Klein-Bottle-Based Dictionary for Texture Representation. Int J Comput Vis 107, 75–97 (2014). https://doi.org/10.1007/s11263-013-0676-2
