# Adaptive Matrices and Filters for Color Texture Classification

DOI: 10.1007/s10851-012-0356-9

- Cite this article as:
- Giotis, I., Bunte, K., Petkov, N. et al. J Math Imaging Vis (2013) 47: 79. doi:10.1007/s10851-012-0356-9

## Abstract

In this paper we introduce an integrative approach towards color texture classification and recognition using a supervised learning framework. Our approach is based on Generalized Learning Vector Quantization (GLVQ), extended by an adaptive distance measure, which is defined in the Fourier domain, and adaptive filter kernels based on Gabor filters. We evaluate the proposed technique on two sets of color texture images and compare results with those other methods achieve. The features and filter kernels learned by GLVQ improve classification accuracy and they are able to generalize much better for data previously unknown to the system.

### Keywords

Adaptive metric, Adaptive filters, Classification, Color texture analysis, Gabor filters, Learning Vector Quantization

## 1 Introduction

Texture analysis and classification are topics of interest due to their numerous possible applications, such as medical imaging, industrial quality control and remote sensing. A wide variety of methods for texture analysis has been developed, such as co-occurrence matrices [11], Markov random fields [34], autocorrelation methods [24, 29], Gabor filtering [6, 9, 16, 18, 21, 32] and wavelet decomposition [33]. These methods mostly concern intensity images, and since color information is a vector quantity, an adaptation to the color domain is not always straightforward. Regarding color texture, the possible approaches can be divided into three categories [25], called parallel, sequential and integrative. In the parallel approach [22, 27] textural features are extracted solely from the luminance plane of an image and are used together with color features. The sequential approach [12] involves a quantization of the color space and subsequently the extraction of statistical features from the indexed images.

The integrative approach [5, 14, 15, 20, 25] is the most popular one and it describes color texture by combining color information with the spatial relationships of image regions within each color channel and between different color channels. The simplest integrative approach would only consist of a gray scale transformation of the input image but in many cases this has been proven insufficient. A very common advance of the integrative approach is based on the opponent-process theory of human color vision that has its roots in neuroscience. Ewald Hering [13] first noted that there are some color combinations that humans are not able to see, such as reddish-green or yellowish-blue, since these colors contrast each other strongly. Hence, he proposed that such color combinations can be the components of one vision mechanism that oppose each other through a process of excitatory and inhibitory responses. A popular application of this theory in computer vision is the Gaussian color model [8].

In this contribution we introduce a novel integrative approach towards color texture classification and recognition based on adaptive filters through supervised learning. The kernels we use are initialized as two-dimensional Gabor filters. A 2D Gabor filter acts as a local band-pass filter and can achieve optimal joint localization both in the spatial and frequency domains [4]. Given a set of labeled color images (RGB) for training and a bank of 2D Gabor filter kernels the goal here is to learn a transformation of a color image to a single channel (intensity) image and an optimal adaptation of the kernels such that the responses of the transformed images when filtered with the optimized kernels will yield the best possible classification.

Many signal processing techniques are based on insights or empirical observations from neurophysiology or optical physics. The proposed novel approach incorporates data-driven adaptation of the system, i.e. example-based learning. Furthermore, the “family” of filters used in our approach can be substituted, depending on the data domain and the task at hand. As an example we explore in this paper the use of rotation and scale invariant descriptors based on Gabor filter responses [10]. We demonstrate that our approach yields very good generalization ability.

The paper is structured as follows: In Sects. 2 and 3 we present overviews of the existing approaches for color texture analysis and the Learning Vector Quantization algorithm respectively. In Sect. 4 the Color Image Analysis LVQ is explained in detail and Sect. 5 presents experimental results. Finally, in Sect. 6 we draw conclusions.

## 2 Overview of Existing Approaches

In texture analysis, Gabor filter responses and Local Binary Patterns are two very popular types of descriptors that have been extended to color texture via integrative approaches that use the opponent color model.

Let *h*_{imn} be the response of the *i*-th color channel of a given image when filtered with a Gabor kernel with scale *m* and orientation *n*. The unichrome features are defined as the square root of the energy of the Gabor responses.
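In symbols, and assuming that the energy of a response is the sum of its squared values over all pixel positions (*x*,*y*), the unichrome feature described above could be written as follows. The symbol \(u_{imn}\) is introduced here only for illustration and is not taken from the original text.

\[
u_{imn} = \sqrt{\sum_{x,y} h_{imn}^{2}(x,y)}
\]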

The notation (*P*, *R*) is used to denote pixel neighborhoods formed by *P* sampling points on a circle of radius *R*. Another extension to the original operator is the definition of the so-called uniform patterns, which can be used to reduce the length of the feature vector and implement a simple rotation-invariant descriptor. This extension was inspired by the fact that some binary patterns occur more commonly in texture images than others. A local binary pattern is called uniform if it contains at most two transitions from 0 to 1 or vice versa when it is traversed circularly. Ojala et al. [24] noticed in their experiments that uniform patterns account for a little less than 90 % of all patterns when using an (8,1) neighborhood and for around 70 % with a (16,2) neighborhood. After the LBP labeled image *f*_{l}(*x*,*y*) has been obtained, the descriptor is defined as the histogram of its labels.
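The uniformity criterion and the histogram descriptor can be illustrated with a short sketch. This is a minimal example under stated assumptions, not the implementation used in the paper: the function names are hypothetical, patterns are stored as integers with one bit per sampling point, and a neighbor is assumed to produce a 1 when its value is greater than or equal to the center value.

```python
def transitions(pattern, P):
    """Count 0/1 changes when the P-bit pattern is traversed circularly."""
    count = 0
    for i in range(P):
        a = (pattern >> i) & 1
        b = (pattern >> ((i + 1) % P)) & 1
        count += a != b
    return count


def is_uniform(pattern, P):
    """A pattern is uniform if it has at most two circular transitions."""
    return transitions(pattern, P) <= 2


def lbp_code(center, neighbors):
    """Threshold neighbors (given in circular order) against the center value.

    Assumption: a neighbor contributes a 1 when it is >= the center.
    """
    code = 0
    for i, value in enumerate(neighbors):
        if value >= center:
            code |= 1 << i
    return code


def histogram(labels, n_bins):
    """Histogram of LBP labels, used as the texture descriptor."""
    hist = [0] * n_bins
    for label in labels:
        hist[label] += 1
    return hist


# All uniform patterns for P = 8 sampling points.
uniform_8 = [p for p in range(2 ** 8) if is_uniform(p, 8)]
```

For *P*=8 only a small subset of the 256 possible patterns is uniform, which is why restricting the histogram to uniform patterns shortens the feature vector.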

In the color extension, the center pixel of a neighborhood can be taken from any of the *n* different color channels. The neighborhood to be thresholded can also be taken from these channels, which makes up a total of *n*^{2} different combinations. The *n*^{2} histogram descriptors are then concatenated into a single feature vector.

## 3 Review of the (Generalized Matrix) Learning Vector Quantization

Learning Vector Quantization (LVQ) is a supervised prototype-based classification method [17]. The training is based on data points **x**^{i}∈ℝ^{D} and their corresponding label information *y*^{i}∈{1,…,*C*}, where *D* denotes the dimension of the feature vectors and *C* the number of classes. A set of prototypes is characterized by their location in the feature space **w**^{i}∈ℝ^{D} and the respective class label *c*(**w**^{i})∈{1,…,*C*}. Classification is implemented as a winner-takes-all scheme. For this purpose, a possibly parameterized dissimilarity measure *d*^{Ω} is defined, where *Ω* specifies the metric parameters which can be adapted during training. Given *d*^{Ω}(**x**,**w**), any data point **x** is assigned to the class label *c*(**w**^{i}) of the closest prototype **w**^{i} with *d*^{Ω}(**x**,**w**^{i})≤*d*^{Ω}(**x**,**w**^{j}) for all *j*≠*i*. The position of the closest (“winner”) prototype in the feature space is then adapted according to a learning rule, i.e. **w**^{i} is moved closer to **x** if the data point is correctly classified and moved away from **x** if otherwise. The number of prototypes used to represent a class can be chosen by the user according to the nature of the data and the task at hand. The typical number of prototypes assigned to each class varies from 1 to 5.
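The winner-takes-all classification and the basic attract/repel update described above can be sketched as follows. This is a minimal illustration, assuming the plain squared Euclidean distance as the dissimilarity and a hypothetical learning rate `alpha`; the function names are not from the paper.

```python
def sq_euclidean(x, w):
    """Squared Euclidean distance, used here as a stand-in dissimilarity."""
    return sum((a - b) ** 2 for a, b in zip(x, w))


def classify(x, prototypes, labels, dist=sq_euclidean):
    """Winner-takes-all: return index and class label of the closest prototype."""
    winner = min(range(len(prototypes)), key=lambda i: dist(x, prototypes[i]))
    return winner, labels[winner]


def lvq_step(x, y, prototypes, labels, alpha=0.1):
    """Move the winner towards x if it classifies x correctly, away otherwise."""
    i, predicted = classify(x, prototypes, labels)
    sign = 1.0 if predicted == y else -1.0
    prototypes[i] = [w + sign * alpha * (a - w) for a, w in zip(x, prototypes[i])]
    return i


# Toy example: two prototypes, one per class (values are made up).
prototypes = [[0.0, 0.0], [1.0, 1.0]]
labels = [1, 2]
winner, label = classify([0.2, 0.1], prototypes, labels)
```

Calling `lvq_step([0.2, 0.0], 1, prototypes, labels)` pulls the first prototype slightly towards the sample, whereas passing the label 2 for the same sample pushes it away.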

The cost function of Eq. (1) is built from the dissimilarities of a data point **x**^{i} from the respective closest correct prototype **w**^{J} and the closest wrong prototype **w**^{K}. The function *Φ* must be monotonic, and throughout the following the identity *Φ*(*x*)=*x* is used.
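For orientation, the GLVQ cost function as it is commonly stated in the literature is given below. It is an assumption that Eq. (1) of this paper has exactly this form; the symbols \(E\) and \(\mu_{i}\) are introduced here for illustration.

\[
E = \sum_{i} \Phi(\mu_{i}), \qquad
\mu_{i} = \frac{d^{\Omega}(\mathbf{x}^{i},\mathbf{w}^{J}) - d^{\Omega}(\mathbf{x}^{i},\mathbf{w}^{K})}{d^{\Omega}(\mathbf{x}^{i},\mathbf{w}^{J}) + d^{\Omega}(\mathbf{x}^{i},\mathbf{w}^{K})}
\]

With non-negative dissimilarities, \(\mu_{i}\) lies in \([-1,1]\) and is negative exactly when **x**^{i} is closer to the correct prototype than to the wrong one, so lowering the cost favors correct classification.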

The matrix *Λ*=*Ω*^{⊤}*Ω* is assumed to be positive (semi-) definite. Hence the measure corresponds to a (squared) Euclidean distance in an appropriately transformed space, with *Ω*∈ℝ^{M×D} and *M*≤*D* without loss of generality. For *M*<*D*, *Ω* transforms the *D*-dimensional data into a lower *M*-dimensional space. This variant is referred to as Limited Rank Matrix LVQ (LiRaM LVQ) and explained in [1, 2]. The original algorithm follows a stochastic gradient descent for the optimization of the cost function (Eq. (1)). The gradients are evaluated with respect to the contribution of single instances **x**^{i}, which are presented in random order and sequentially during training. The algorithm has been introduced and discussed in [31] and will be modified in the subsequent sections.
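The relation between the matrix *Λ*=*Ω*^{⊤}*Ω* and the squared Euclidean distance in the transformed space can be checked numerically. The sketch below assumes real-valued data and uses hypothetical function names and made-up values; it only illustrates that the two formulations agree.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) with a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def gmlvq_distance(x, w, omega):
    """Squared Euclidean distance after mapping x - w with omega (M x D)."""
    diff = [a - b for a, b in zip(x, w)]
    return sum(p * p for p in matvec(omega, diff))


def relevance_matrix(omega):
    """Lambda = Omega^T Omega, a D x D positive semi-definite matrix."""
    d = len(omega[0])
    return [[sum(row[i] * row[j] for row in omega) for j in range(d)]
            for i in range(d)]


def quadratic_form(x, w, lam):
    """(x - w)^T Lambda (x - w), the same distance expressed via Lambda."""
    diff = [a - b for a, b in zip(x, w)]
    return sum(d * v for d, v in zip(diff, matvec(lam, diff)))


# Limited-rank example: M = 2 < D = 3, so the third dimension is ignored.
omega = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
x, w = [1.0, 1.0, 5.0], [0.0, 0.0, 0.0]
d_omega = gmlvq_distance(x, w, omega)
d_lambda = quadratic_form(x, w, relevance_matrix(omega))
```

In this toy case the rectangular *Ω* discards the third feature entirely, which is the dimension-reducing effect described for *M*<*D*.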

## 4 Color Image Analysis Learning Vector Quantization

In this contribution we present an extension of the GMLVQ concept that is designed specifically for color texture analysis. We use the same cost function, Eq. (1), as in the original GMLVQ algorithm and follow a stochastic gradient descent procedure in which the samples **x**^{i} of the training set are presented sequentially and the parameters are updated accordingly. We will refer to this algorithm as Color Image Analysis LVQ (CIA-LVQ) and to one sweep through the training set as one epoch *E*.

Let **D** be a data set of color images of known size (*p*×*p*) that belong to *C* different classes, and let **G** be a bank of filter kernels, initialized as a sum of Gabor filters with different scales and orientations. The goal is to learn one or more matrices *Ω*_{k} that transform the color images into single-channel “intensity” images, a set of optimized kernels \(\widehat{G}_{k}\) and a set of prototypes **w**^{k} such that the filter responses of the transformed images yield the best possible classification. In addition, we use an adaptation of the learning rates that allows the system to be less dependent on their initial values.

The images are represented as complex vectors **x**^{i}∈ℂ^{N}, where *N*=*p*⋅*p*⋅3, with *p* denoting the image patch size. These vectors are transformed by *Ω*_{k}∈ℂ^{M×N}, where *M*=*p*⋅*p*. The transformation *Ω*_{k} can be considered as the equivalent of a color to gray scale image transformation, with *k* referring to the index of a prototype **w**^{k} or the index of its class label for class-wise transformations. Subsequently, the transformed image data are filtered with every kernel *G*_{l}∈**G** and the responses over all *l* are summed up. The filter kernels are also represented as complex vectors *G*_{l}∈ℂ^{M}. The descriptor of an individual image is the resulting summed filter response, and every training image carries a class label *y*^{i}∈{1,2,…,*C*}.
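The pipeline described above (transform, filter with every kernel, sum the responses) can be sketched for complex vectors as follows. This is a rough illustration under two assumptions that are not spelled out in the surviving text: filtering is modeled as an element-wise product of frequency-domain vectors, and descriptors are compared through the squared difference of their magnitudes, as the concluding section suggests. All names and values are hypothetical.

```python
def transform(omega, x):
    """Apply the M x N matrix omega to the length-N vector x (color to single channel)."""
    return [sum(o * a for o, a in zip(row, x)) for row in omega]


def descriptor(x, omega, kernels):
    """Filter the transformed vector with every kernel and sum the responses.

    Assumption: filtering is an element-wise product in the frequency domain.
    """
    g = transform(omega, x)
    responses = [[k_m * g_m for k_m, g_m in zip(kernel, g)] for kernel in kernels]
    return [sum(column) for column in zip(*responses)]


def magnitude_dissimilarity(r_x, r_w):
    """Squared Euclidean distance between the magnitudes of two descriptors."""
    return sum((abs(a) - abs(b)) ** 2 for a, b in zip(r_x, r_w))


# Toy example with p = 1, so N = 3 and M = 1 (all values are made up).
omega = [[1, 1, 1]]
kernels = [[1 + 0j], [1j]]
x = [3 + 0j, 6 + 0j, 3j]
r = descriptor(x, omega, kernels)

# A pure phase change leaves the magnitude-based dissimilarity at zero.
r_rotated = [1j * value for value in r]
d_phase = magnitude_dissimilarity(r, r_rotated)
```

Because only magnitudes enter the comparison in this sketch, two descriptors that differ by a phase factor are treated as identical.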

### 4.1 Explicit Form of the Learning Rules

The learning rules are obtained from the derivatives of the cost function with respect to **w**^{k}, *Ω*_{k} and \(\widehat{G}_{k}\). In the parameter updates, Eqs. (9)–(14), *L*∈{*J*,*K*} and *α*, *ϵ* and *η* are the learning rates for the prototypes, the transformation matrix and the kernel used for filtering respectively.

The derivatives with respect to the closest correct prototype **w**^{J} and the closest wrong prototype **w**^{K}, together with the corresponding matrices *Ω*_{J}, *Ω*_{K} and the filter kernels \(\widehat{G}_{J}\), \(\widehat{G}_{K}\), for the given training data point **x**^{i} are written with ^{∗} denoting the complex conjugate. Note that, since we are working with complex values, all derivatives have to be taken with respect to the real and imaginary parts separately.

### 4.2 Adaptation of the Learning Rates

Steepest descent methods rely upon the choice of a suitable magnitude for the update step (learning rate). Very small steps usually only slow down convergence, whereas very large steps might result in oscillatory or divergent behavior. In the case of CIA-LVQ the update steps are denoted as *α*, *ϵ* and *η*, and the issue of choosing their values is addressed by considering way-point averages over a number of the latest iteration steps together with an efficient step size adaptation. This technique is discussed in [26] for normalized gradients, but in CIA-LVQ we use its basic principles without the normalization.

The optimization of a parameter **x** is an iterative process with an initial learning rate value *ψ*_{0} and an initial parameter value **x**_{0}. At every iteration step the cost function *f*_{c}(**x**_{j}) is computed. At first we perform *k*>1 unaltered gradient steps for *j*=0,1,…,*k*−1 with *ψ*_{j}=*ψ*_{0}. Subsequently, apart from the current gradient step \(\tilde{\mathbf {x}}_{t+1}\) we also compute the way-point average \(\hat{\mathbf {x}}_{t+1}\) of the previous *k* steps, and both are used to determine the new position **x**_{t+1} and the new step size *ψ*_{t+1}. As long as a simple gradient descent step yields a position for the parameter **x** that results in lower costs than the average of the *k* latest positions of **x**, the iterative process remains unaltered. On the other hand, \(f_{c}(\tilde{\mathbf {x}}_{t+1}) > f_{c}(\hat{\mathbf {x}}_{t+1})\) indicates that the step size is too large and should be reduced by a factor *β*.
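The step size adaptation described above can be sketched on a toy problem. This is a minimal illustration, not the CIA-LVQ implementation: `cost` and `grad` are hypothetical callables, the values of `psi0`, `k` and `beta` are made up, the way-point average is taken over the last *k* stored positions, and it is assumed that the iteration continues from the way-point average whenever the plain gradient step is rejected.

```python
def adaptive_gradient_descent(cost, grad, x0, psi0=0.1, k=3, beta=0.5, steps=50):
    """Gradient descent with way-point averaging and step size reduction."""
    x = list(x0)
    psi = psi0
    history = [x]  # visited positions, most recent last
    for t in range(steps):
        g = grad(x)
        # Plain (unnormalized) gradient step with the current step size.
        candidate = [xi - psi * gi for xi, gi in zip(x, g)]
        if t >= k:  # the first k steps are performed unaltered
            recent = history[-k:]
            waypoint = [sum(column) / k for column in zip(*recent)]
            if cost(candidate) > cost(waypoint):
                # Step size too large: shrink it and (assumption) continue
                # from the way-point average instead of the gradient step.
                candidate = waypoint
                psi *= beta
        x = candidate
        history.append(x)
    return x, psi


# Toy problem: f(x) = x^2 with a step size that makes plain descent diverge.
def cost(x):
    return x[0] ** 2


def grad(x):
    return [2.0 * x[0]]


x_final, psi_final = adaptive_gradient_descent(cost, grad, [1.0], psi0=1.1)
```

With the initial step size of 1.1 the plain iteration on this quadratic oscillates with growing amplitude; after the first reduction the iteration contracts towards the minimum.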

In the next section we experiment with the algorithm and show its use in practice.

## 5 Experiments

In order to evaluate the usefulness of the proposed algorithm, we perform classification on patches of pictures taken from the VisTex [3] and the KTH-TIPS [7] databases. From the VisTex database we use 29 color images with size 128×128 pixels from the groups Bark, Brick, Fabric and Food. The KTH-TIPS set is used in its original form and consists of 810 color images with size 200×200 pixels from 10 different classes: Sandpaper, Aluminium Foil, Sponge, Styrofoam, Corduroy, Linen, Brown Bread, Cotton, Orange Peel and Cracker. Although in texture classification literature every image is often considered as a different class, here we distinguish into four and ten different classes respectively, which are equivalent to the conceptual groups that the images belong to. Despite its increased difficulty, this classification task allows us to better demonstrate the ability of CIA-LVQ to describe general characteristics of real-world texture patterns.

For our experiments we draw 15×15 patches randomly from each image. The training subsets of images are further divided into training and test sets of patches. The VisTex training subset consists of 200 patches per image. We use 150 patches per image (2400 data points) for training and test the performance of CIA-LVQ on the remaining 50 patches per image (800 data points). With respect to the KTH-TIPS training subset we draw 9 patches per image and use 6 for training (3240 data points) and the remaining 3 (1620 data points) for testing. The test sets may contain patches which partially overlap with those used for training. Therefore we use the images in Figs. 2 and 4 in order to create evaluation sets that have never been seen in the training process and thus better demonstrate the generalization ability of the proposed approach. The evaluation sets consist of 50 and 6 randomly drawn patches per image for VisTex and KTH-TIPS respectively.

A note is due here to the nature of the filters used for initialization. A 2D Gabor filter is defined as a Gaussian kernel function modulated by a sinusoidal plane wave. All filter kernels can be generated from one basic wavelet by dilation and rotation. In these experiments we initialize the adaptive filter banks as follows: Every bank consists of 16 Gabor filters of bandwidth equal to 1 at eight orientations *θ*=0, 22.5, 45, 67.5, 90, 112.5, 135 and 157.5 degrees and two scales (wavelengths) varying by one octave \(\lambda= \{ 5,5\sqrt{2}\}\). These scales ensure that the Gabor function yields an adequate number of visible parallel excitatory and inhibitory stripe zones. Dependent on the patch size and the nature of the data at hand different scales might be more suitable. We set the phase offset *ϕ*=0 and the aspect ratio *γ*=1 for all filters. In this way we create center-on symmetric filters with circular support.
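The initialization described above can be sketched as follows. This is a minimal spatial-domain example, not the code used in the experiments: the function name is hypothetical, the relation between bandwidth and the Gaussian width `sigma` is a commonly used one and is an assumption here, and the conversion of the kernels into the complex vector representation used by CIA-LVQ is not shown.

```python
import math


def gabor_kernel(size, wavelength, theta_deg, bandwidth=1.0, phase=0.0, gamma=1.0):
    """Real 2D Gabor kernel: a Gaussian envelope times a cosine plane wave."""
    theta = math.radians(theta_deg)
    # Assumption: common relation between bandwidth (in octaves) and sigma.
    ratio = (2 ** bandwidth + 1) / (2 ** bandwidth - 1)
    sigma = wavelength / math.pi * math.sqrt(math.log(2) / 2) * ratio
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by the filter orientation.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength + phase))
        kernel.append(row)
    return kernel


# Eight orientations and two wavelengths one octave... apart by a factor sqrt(2),
# as listed in the text: 16 kernels on 15x15 patches.
orientations = [22.5 * i for i in range(8)]
wavelengths = [5.0, 5.0 * math.sqrt(2)]
bank = [gabor_kernel(15, lam, th) for lam in wavelengths for th in orientations]
```

With phase offset 0 every kernel has its maximum at the center and is point-symmetric, which corresponds to the center-on symmetric filters mentioned in the text.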

We run the localized version of CIA-LVQ with matrices *Ω*_{k} initialized with the identity matrix and 4 prototypes per class for *E*=300 epochs. The prototypes are initialized as the mean of the corresponding class. Regarding VisTex the training error is 5.75 % and the error on the test set 15 %. For the KTH-TIPS data set CIA-LVQ reaches training and test errors of 15.4 % and 22.8 % respectively.

… *s* and in this case the image patch descriptor is given by …

For the OCF we use a *k*-nearest neighbors (*k*-NN) classification scheme with precisely the set of features and the dissimilarity measure suggested by the authors of [15], whereas for the Color LBP we use rotation-invariant uniform LBP histograms in (8,1) neighborhoods and the Euclidean distance in a *k*-NN scheme. We choose the size of the neighborhood in relation to the patch size and the dimensions of the feature vectors created. With respect to the RGB2G approach we use the *k*-NN scheme with a dissimilarity measure similar to Eq. (8).

For all *k*-NN schemes we cross-validate the number of nearest neighbors using the values *k*=1,3,…,15 on the testing image patches from the training subsets. The optimal *k* obtained is then used for experimenting on the previously unseen evaluation image patches. Ties are resolved by defaulting to the 1-NN classifier.
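The selection of *k* and the tie handling described above can be sketched as follows. This is a minimal illustration with hypothetical function names and made-up toy data; picking the smallest *k* when several values give the same error is an assumption, and the squared Euclidean distance stands in for the method-specific dissimilarity measures.

```python
from collections import Counter


def sq_euclidean(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))


def knn_predict(x, train_x, train_y, k, dist=sq_euclidean):
    """Majority vote among the k nearest neighbors; ties fall back to 1-NN."""
    order = sorted(range(len(train_x)), key=lambda i: dist(x, train_x[i]))
    votes = Counter(train_y[i] for i in order[:k]).most_common()
    if len(votes) > 1 and votes[0][1] == votes[1][1]:
        return train_y[order[0]]  # tie: use the label of the nearest neighbor
    return votes[0][0]


def select_k(val_x, val_y, train_x, train_y, candidates=range(1, 16, 2)):
    """Pick the k with the lowest validation error (smallest k on equal error)."""
    def error(k):
        wrong = sum(knn_predict(x, train_x, train_y, k) != y
                    for x, y in zip(val_x, val_y))
        return wrong / len(val_x)
    return min(candidates, key=error)


# Toy one-dimensional data (values are made up).
train_x = [[0.0], [0.1], [1.0], [1.1], [0.5]]
train_y = ["a", "a", "b", "b", "c"]
best_k = select_k([[0.05], [1.05]], ["a", "b"], train_x, train_y)
```

The chosen *k* would then be kept fixed when classifying the separate evaluation patches.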

### 5.1 Comparisons on the VisTex Data Set

The *k*-NN scheme shows a test error of 9.1 % based on the OCF (*k*=3), 2 % based on the Color LBP (*k*=1) and 25.8 % based on the RGB2G transformation (*k*=1), but the most interesting comparison relies on the evaluation set, which displays the generalization ability of each method. Here the *k*-NN scheme produces much higher error rates of 35.2 %, 25.2 % and 50 % for OCF, Color LBP and RGB2G respectively, while CIA-LVQ has an error of 13.1 %, of the same order of magnitude as for the test patches. Table 1 presents in detail the confusion matrices and class-wise accuracies of all methods for the evaluation set.

Table 1: Confusion matrices for the VisTex evaluation set (columns: actual class, rows: assigned class)

CIA-LVQ:

| | Bark | Brick | Fabric | Food | ∑ |
|---|---|---|---|---|---|
| Bark | 179 | 2 | 23 | 4 | 208 |
| Brick | 5 | 85 | 1 | 2 | 93 |
| Fabric | 2 | 13 | 176 | 19 | 210 |
| Food | 14 | 0 | 0 | 125 | 139 |
| ∑ | 200 | 100 | 200 | 150 | 650 |
| Class-wise accuracy of estimation in % | 89.50 | 85.00 | 88.00 | 83.33 | |

OCF:

| | Bark | Brick | Fabric | Food | ∑ |
|---|---|---|---|---|---|
| Bark | 111 | 10 | 35 | 36 | 192 |
| Brick | 70 | 78 | 10 | 26 | 184 |
| Fabric | 16 | 12 | 155 | 11 | 194 |
| Food | 3 | 0 | 0 | 77 | 80 |
| ∑ | 200 | 100 | 200 | 150 | 650 |
| Class-wise accuracy of estimation in % | 55.50 | 78.00 | 77.50 | 51.33 | |

Color LBP:

| | Bark | Brick | Fabric | Food | ∑ |
|---|---|---|---|---|---|
| Bark | 152 | 24 | 2 | 6 | 174 |
| Brick | 21 | 56 | 12 | 1 | 178 |
| Fabric | 2 | 12 | 138 | 3 | 127 |
| Food | 25 | 8 | 4 | 140 | 181 |
| ∑ | 200 | 100 | 200 | 150 | 650 |
| Class-wise accuracy of estimation in % | 76.00 | 56.00 | 69.00 | 93.33 | |

RGB2G:

| | Bark | Brick | Fabric | Food | ∑ |
|---|---|---|---|---|---|
| Bark | 79 | 12 | 38 | 38 | 167 |
| Brick | 64 | 62 | 34 | 28 | 188 |
| Fabric | 16 | 15 | 113 | 13 | 157 |
| Food | 41 | 11 | 15 | 71 | 138 |
| ∑ | 200 | 100 | 200 | 150 | 650 |
| Class-wise accuracy of estimation in % | 39.50 | 62.00 | 56.50 | 47.33 | |
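The class-wise accuracies and the overall error quoted for CIA-LVQ on the VisTex evaluation set can be recomputed from its confusion matrix. The sketch below uses hypothetical function names and assumes that columns hold the actual classes and rows the assigned classes, which is consistent with the column sums of 200, 100, 200 and 150 patches.

```python
def classwise_accuracy(confusion):
    """Percentage of each actual class (column) that lands on the diagonal."""
    n = len(confusion)
    col_sums = [sum(confusion[r][c] for r in range(n)) for c in range(n)]
    return [100.0 * confusion[c][c] / col_sums[c] for c in range(n)]


def overall_error(confusion):
    """Percentage of all patches that are off the diagonal."""
    total = sum(map(sum, confusion))
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return 100.0 * (1 - correct / total)


# CIA-LVQ confusion matrix for the VisTex evaluation set
# (order: Bark, Brick, Fabric, Food).
cia_lvq_vistex = [
    [179, 2, 23, 4],
    [5, 85, 1, 2],
    [2, 13, 176, 19],
    [14, 0, 0, 125],
]

accuracies = [round(a, 2) for a in classwise_accuracy(cia_lvq_vistex)]
error = round(overall_error(cia_lvq_vistex), 1)
```

The rounded values agree with the class-wise accuracies listed for CIA-LVQ and with the 13.1 % evaluation error stated in the text.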

### 5.2 Comparisons on the KTH-TIPS Database

The *k*-NN scheme shows a test error of 41.7 % based on the OCF (*k*=13), 26.4 % based on the Color LBP (*k*=11) and 52.7 % based on the RGB2G transformation (*k*=11), which are all higher than what CIA-LVQ can achieve. On the evaluation set the superior performance of the proposed technique becomes even clearer. The *k*-NN scheme reaches error rates of 46.4 %, 35.6 % and 58.4 % for OCF, Color LBP and RGB2G respectively, while CIA-LVQ has an error of 20.3 %, again of the same order of magnitude as for the test patches. Table 2 presents in detail the confusion matrices and class-wise accuracies of all methods for the evaluation set of the KTH-TIPS database.

Table 2: Confusion matrices for the KTH-TIPS evaluation set (columns: actual class, rows: assigned class)

CIA-LVQ:

| | S/paper | Al. Foil | Sponge | Styrofoam | Corduroy | Linen | Br. Bread | Cotton | Or. Peel | Cracker | ∑ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| S/paper | 129 | 0 | 2 | 0 | 0 | 4 | 0 | 9 | 0 | 0 | 144 |
| Al. Foil | 0 | 123 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 123 |
| Sponge | 0 | 0 | 107 | 0 | 41 | 0 | 30 | 0 | 0 | 0 | 178 |
| Styrofoam | 0 | 4 | 0 | 133 | 0 | 0 | 0 | 0 | 2 | 0 | 139 |
| Corduroy | 0 | 0 | 22 | 0 | 91 | 0 | 32 | 0 | 0 | 0 | 145 |
| Linen | 0 | 0 | 0 | 2 | 0 | 108 | 0 | 54 | 0 | 0 | 164 |
| Br. Bread | 0 | 0 | 3 | 0 | 3 | 0 | 51 | 0 | 0 | 1 | 58 |
| Cotton | 1 | 7 | 0 | 0 | 0 | 23 | 0 | 70 | 2 | 0 | 103 |
| Or. Peel | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 131 | 1 | 132 |
| Cracker | 5 | 1 | 1 | 0 | 0 | 0 | 22 | 2 | 0 | 133 | 164 |
| ∑ | 135 | 135 | 135 | 135 | 135 | 135 | 135 | 135 | 135 | 135 | 1350 |
| Class-wise accuracy of estimation in % | 95.56 | 91.11 | 79.26 | 98.52 | 67.41 | 80.00 | 37.78 | 51.85 | 97.04 | 98.52 | |

OCF:

| | S/paper | Al. Foil | Sponge | Styrofoam | Corduroy | Linen | Br. Bread | Cotton | Or. Peel | Cracker | ∑ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| S/paper | 34 | 0 | 21 | 26 | 0 | 2 | 7 | 10 | 2 | 7 | 109 |
| Al. Foil | 0 | 112 | 1 | 1 | 2 | 1 | 7 | 4 | 0 | 6 | 134 |
| Sponge | 30 | 0 | 43 | 10 | 8 | 3 | 17 | 4 | 13 | 12 | 140 |
| Styrofoam | 28 | 0 | 15 | 61 | 3 | 17 | 7 | 13 | 0 | 7 | 151 |
| Corduroy | 7 | 4 | 7 | 6 | 97 | 2 | 3 | 22 | 0 | 5 | 153 |
| Linen | 2 | 0 | 0 | 2 | 1 | 91 | 4 | 10 | 2 | 3 | 115 |
| Br. Bread | 16 | 6 | 39 | 13 | 8 | 4 | 58 | 5 | 7 | 33 | 189 |
| Cotton | 3 | 0 | 1 | 5 | 1 | 3 | 1 | 61 | 6 | 0 | 81 |
| Or. Peel | 7 | 0 | 1 | 0 | 1 | 1 | 1 | 4 | 105 | 1 | 121 |
| Cracker | 8 | 13 | 7 | 11 | 14 | 11 | 30 | 2 | 0 | 61 | 157 |
| ∑ | 135 | 135 | 135 | 135 | 135 | 135 | 135 | 135 | 135 | 135 | 1350 |
| Class-wise accuracy of estimation in % | 25.19 | 82.96 | 31.85 | 45.19 | 71.85 | 67.41 | 42.96 | 45.19 | 77.78 | 45.19 | |

Color LBP:

| | S/paper | Al. Foil | Sponge | Styrofoam | Corduroy | Linen | Br. Bread | Cotton | Or. Peel | Cracker | ∑ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| S/paper | 66 | 0 | 27 | 1 | 11 | 4 | 8 | 6 | 0 | 25 | 148 |
| Al. Foil | 0 | 134 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 134 |
| Sponge | 28 | 0 | 65 | 1 | 5 | 0 | 14 | 1 | 28 | 26 | 168 |
| Styrofoam | 0 | 0 | 0 | 126 | 0 | 30 | 0 | 6 | 0 | 2 | 164 |
| Corduroy | 2 | 0 | 8 | 0 | 109 | 1 | 15 | 5 | 0 | 5 | 145 |
| Linen | 0 | 1 | 0 | 1 | 0 | 69 | 0 | 36 | 0 | 0 | 107 |
| Br. Bread | 13 | 0 | 20 | 1 | 6 | 7 | 86 | 1 | 6 | 27 | 167 |
| Cotton | 0 | 0 | 0 | 2 | 1 | 20 | 0 | 72 | 5 | 3 | 103 |
| Or. Peel | 5 | 0 | 15 | 0 | 2 | 0 | 1 | 0 | 95 | 0 | 118 |
| Cracker | 21 | 0 | 0 | 3 | 1 | 4 | 11 | 8 | 1 | 47 | 96 |
| ∑ | 135 | 135 | 135 | 135 | 135 | 135 | 135 | 135 | 135 | 135 | 1350 |
| Class-wise accuracy of estimation in % | 48.89 | 99.26 | 48.15 | 93.33 | 80.74 | 51.11 | 63.70 | 53.33 | 70.37 | 34.81 | |

RGB2G:

| | S/paper | Al. Foil | Sponge | Styrofoam | Corduroy | Linen | Br. Bread | Cotton | Or. Peel | Cracker | ∑ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| S/paper | 26 | 1 | 17 | 38 | 2 | 11 | 9 | 7 | 24 | 5 | 140 |
| Al. Foil | 1 | 81 | 0 | 0 | 0 | 1 | 3 | 0 | 0 | 8 | 94 |
| Sponge | 22 | 2 | 33 | 13 | 5 | 11 | 16 | 5 | 26 | 14 | 147 |
| Styrofoam | 37 | 1 | 12 | 45 | 0 | 11 | 1 | 8 | 8 | 13 | 136 |
| Corduroy | 0 | 15 | 8 | 1 | 108 | 6 | 9 | 17 | 3 | 7 | 174 |
| Linen | 1 | 5 | 1 | 4 | 3 | 56 | 10 | 2 | 1 | 6 | 89 |
| Br. Bread | 10 | 6 | 22 | 7 | 3 | 9 | 50 | 2 | 10 | 37 | 156 |
| Cotton | 11 | 2 | 4 | 3 | 5 | 12 | 3 | 77 | 12 | 4 | 133 |
| Or. Peel | 24 | 1 | 32 | 19 | 3 | 10 | 7 | 16 | 51 | 6 | 169 |
| Cracker | 3 | 21 | 6 | 5 | 6 | 8 | 27 | 1 | 0 | 35 | 112 |
| ∑ | 135 | 135 | 135 | 135 | 135 | 135 | 135 | 135 | 135 | 135 | 1350 |
| Class-wise accuracy of estimation in % | 19.26 | 60.00 | 24.44 | 33.33 | 80.00 | 41.48 | 37.04 | 57.04 | 37.78 | 25.93 | |

## 6 Conclusion and Outlook

In this contribution we propose a prototype based framework for color texture classification. As an example we initialize the system with Gabor filters and classify color texture patterns in 15×15 patches randomly drawn from images of two public data sets. The results show that CIA-LVQ can learn typical texture patterns with very good generalization, even from relatively small patches and filter banks and it consistently outperforms state of the art techniques used for color texture analysis. It is also of conceptual value that this LVQ adaptation is suitable for learning in the complex number domain.

The resulting filter kernels may not strictly conform to the notion of Gabor filters; however, they preserve the important property of symmetric and periodic excitatory and inhibitory regions, the shape and size of which are data driven. In principle every adaptive metric method could be extended following our suggestion, but we deliberately choose LVQ because of its easily interpretable results and its lower computational cost in comparison to other approaches. As with Gabor filters, any other family of 2D filters commonly used to describe gray scale image information could be adapted and applied to color image analysis with this algorithm. Initializing with a filter bank of differences of Gaussians for color edge detection is a possible example. Furthermore, depending on the task at hand it might be desirable that two patches in which the same texture occurs at different positions should not be interpreted as similar. In this case another similarity measure should be used: \(\lVert|\mathbf {r}(\mathbf {x}^{i}) - \mathbf {r}(\mathbf {w}^{L}) |\rVert^{2}\), which is not based on the difference of magnitudes. This might be of advantage for example in the recognition of objects such as traffic signs, where a corner or an edge might have different interpretations depending on its position in the image. Combinations of CIA-LVQ with keypoint detectors, which avoid drawing patches from random positions within an image, can also be easily implemented and can be beneficial especially for tasks that are related to object recognition. A variant of CIA-LVQ that is completely unbiased regarding the nature of the filters, in which the adaptive kernels are randomly initialized, is also of particular interest, mostly in cases where there is no prior knowledge about the nature of the data (e.g. medical imaging).

CIA-LVQ formulates a novel general principle: based on a differentiable convolution and an adaptive filter bank, the algorithm optimizes the classification. Contrary to standard approaches which are either based on a single channel representation of the images through a fixed transformation or empirical observations for combining color and textural information, the proposed technique offers the alternative of data driven learning of suitable, parameterized image descriptors. The ability of automatically weighing different color channels and different filters in localized neighborhoods, according to their importance for the classification task, is the most significant factor which qualifies our approach.

### Open Access

This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.