
Neural Computing and Applications, Volume 30, Issue 3, pp 871–889

Human mimic color perception for segmentation of color images using a three-layered self-organizing map previously trained to classify color chromaticity

  • Farid García-Lamont
  • Jair Cervantes
  • Asdrúbal López-Chau
Original Article

Abstract

Most works addressing segmentation of color images use clustering-based methods; the drawback of such methods is that they require a priori knowledge of the number of clusters, so the number of clusters is set depending on the nature of the scene so as not to lose color features of the scene. Other works that employ different unsupervised learning-based methods use the colors of the given image, but the classifying method is retrained whenever a new image is given. Humans have the natural capability to: (1) recognize colors by using their previous knowledge, that is, they do not need to learn to identify colors every time they observe a new image and, (2) within a scene, recognize regions or objects by their chromaticity features. Hence, in this paper we propose to emulate human color perception for color image segmentation. We train a three-layered self-organizing map with chromaticity samples so that the neural network is able to segment color images by their chromaticity features. When training is finished, we use the same neural network to process several images, without training it again and without specifying, to some extent, the number of colors the image has. The hue component of colors is extracted by mapping the input image from the RGB space to the HSV space. We test our proposal using the Berkeley segmentation database and compare our results quantitatively with related works; according to the comparison, we claim that our approach is competitive.

Keywords

Self-organizing maps · Color classification · Image segmentation · Color spaces

1 Introduction

Segmentation of images is an important and widely studied research issue, aimed at extracting and/or recognizing objects by features such as texture, color and shape. Depending on the nature of the problem, the color characteristics of the objects may be an important feature that provides relevant data about them. For instance, the segmentation of color images has been applied in areas such as food analysis [1, 2], geology [3] and medicine [4, 5, 6], among others [7, 8, 9, 10].

Works addressing the segmentation of color images apply different techniques [11, 12, 13], but the most employed are unsupervised neural networks (NNs) [14, 15, 16, 17, 18, 19, 20, 21] and methods based on clustering, specifically fuzzy C-means (FCM) [22, 23, 24, 25, 26, 27]. The NNs are trained to recognize only specific colors, i.e., they are trained with the colors of the given image to segment; if a new image is given, the NN must be trained again. With cluster-based methods, clusters of colors with similar features are created; the drawback of such methods is that they require a priori knowledge of the number of clusters, which is therefore set depending on the nature of the scene so as not to lose color features of the scene.

Our proposal consists in training a NN to recognize different colors, trying to mimic human color perception. Humans identify colors by their chromaticity as the main feature and then by their intensity [28]. For instance, if the reader is asked what color squares (a) and (b) of Fig. 1 are, he/she would almost surely answer “green”; note that square (a) is brighter than square (b), but the chromaticity does not change; hence, squares (a) and (b) have the same color but different intensities. Now, if the reader is asked what color squares (c) and (d) of Fig. 1 are, he/she would almost surely answer “red and pink, respectively.” The intensity of squares (c) and (d) is the same and the chromaticity difference between them is small, yet we notice that the colors of squares (c) and (d) are not the same, despite both squares having the same intensity.
Fig. 1

Color of squares (a) and (b) with the same chromaticity but with different intensities; color of squares (c) and (d) with different chromaticities but with the same intensity (color figure online)

By observing the environment, humans can recognize objects and/or regions within a scene by their chromaticity features. Thus, our approach is inspired by the fact that humans can recognize different parts of a scene by identifying the chromaticity of colors.

For instance, we can appreciate the different parts of the objects shown in Fig. 2 by their color features. Note that the pixels’ intensities in the images of row (a) are different, while the intensity of the objects in the images of row (b) is the same; the chromaticity in the images of both rows does not change. Despite the intensity difference of the objects’ colors in the two rows, we can identify the parts that make up the objects by their color features; it is just necessary to homogenize the hue of the colors.
Fig. 2

Row (a) image samples of objects with different colors and intensities, row (b) images of the same objects but with the same intensity (color figure online)

Hence, it is enough, to some extent, to recognize the chromaticity so as to identify the different parts of an object or scene. As mentioned before, humans are capable of recognizing colors by their chromaticity features, independently of the intensity; this capability is emulated by employing the HSV space, because in this space the chromaticity is decoupled from the intensity [28]. Thus, the chromaticity can be processed without being affected by non-uniform illumination conditions. In order to segment an image, the pixels are labeled according to their hue, where pixels with similar hue are assigned the same label.

It is important to remark that human beings have the innate capability to recognize colors, and they do not need to learn to identify colors every time a new scene is shown; they just employ their knowledge acquired previously.

In order to emulate this human capability, we propose a NN composed of three layers of self-organizing maps (SOMs), trained with chromaticity samples of different colors. The first layer performs a preliminary clustering of the input chromaticity data; the second layer, which is smaller, is trained so that only the prominent colors are selected. It is important to remark that the number of neurons of each SOM defines the number of colors the SOM can recognize.

If the size difference between both layers is huge, it is possible that not all the color features may be learnt by the second layer. To overcome such situation, we propose to add a third layer to the NN, such that its size harmonizes with the sizes of the other layers. That is, the sizes of the first, second and third layers should be huge, medium and small, respectively; this architecture lets us reduce the amount of colors to recognize without losing color features, as much as possible. The image is segmented according to the colors learnt by the third layer. The sizes of the layers must be defined such that the size proportions are adequate; in this work, we propose that the first, second and third SOMs have 256, 100 and 25 neurons, respectively. Thus, the NN recognizes up to 25 different colors. We give more details about the NN’s architecture in Sect. 3.2.

Our approach is similar to the proposal of [15]; however, the main differences are:
  1. In [15], the NN is trained with the colors of the image to be processed, so it must be retrained for every given image, while we train our NN just once using chromaticity samples; when training is finished, any image can be processed with it.
  2. They employ a two-layered SOM, where the first and second layers have 256 and 20 neurons, respectively, set on 16 × 16 and 20 × 1 arrays; the size difference between both layers is large, so the knowledge acquired by the first layer may be lost.

Therefore, the contribution of this paper is a proposal for color image segmentation that emulates, or mimics, the way human beings seem to recognize colors, mainly by the chromaticity feature; we design and apply a three-layered SOM trained with chromaticity data of colors. After training, the NN is able to segment any color image without being trained again.

The paper is organized as follows: related works are reviewed in Sect. 2; in Sect. 3, we explain how the chromaticity of colors is extracted from images, as well as the NN architecture and its training; experiments and results are shown in Sect. 4; results are discussed in Sect. 5; and finally, conclusions and future work are presented in Sect. 6.

2 Related works

The segmentation of color images has been addressed in different ways, where cluster-based methods are often employed. A review of the state of the art on color image segmentation is presented in the next paragraphs.

A fuzzy system designed with neuro-adaptive learning techniques is proposed in Ref. [14]; from a given image, the proposed system can reveal, for each pixel, the probability of belonging to a specified color. The intensity of every pixel in the gray-level output image shows this probability. After selecting a threshold value, a binary image is computed, which can be used as a mask to segment the desired color in the input image.

Ong et al. [15] present a two-stage hierarchical NN based on SOMs for color image segmentation. The first stage of the network uses a two-dimensional feature map, which captures the dominant colors of an image. The second stage employs a one-dimensional feature map to control the number of color clusters that is used for segmentation.

In [11] is introduced an algorithm called hill manipulation. From a given color image, it starts by segmenting the 3D color histogram into hills according to the number of local maxima found. Each hill is compared against defined criteria for possible splitting into more homogeneous smaller hills. The resulting hills undergo a post-processing task that filters out the small nonsignificant regions.

Ito et al. [28] propose a segmentation method that classifies images into different segments based on human visual perception and achromatic color. The histograms of hue, saturation and intensity of the image are computed, and three segmentation results, one from each histogram, are obtained. The achromatic colors are considered in order to decrease the number of regions.

In Ref. [23], two improved FCM clustering algorithms with spatial constraints for color image segmentation are presented. In order to obtain spatial data of the pixels, the rank M-type and L-estimators are used. With these estimators, the local data of every color component in the RGB model (FCM_RMLrgb) are incorporated; to cover some limitations related to RGB model, the proposed approach is applied in the chromatic subspace in the IJK color space (FCM_RMLjk). Such estimators are involved in the FCM algorithm to provide robustness for the proposed segmentation techniques.

In [22], the multi-level low-rank approximation-based spectral clustering method is proposed to segment high-resolution images. The proposed method is a graph-theoretic approach, which finds natural groups in a given data set. It approximates the multi-level low-rank matrix, the approximations to the affinity matrix and its subspace, as well as those for the Laplacian matrix and the Laplacian subspace, and gains computational spatial efficiency.

Liu et al. [13] propose a segmentation method of mixture models of multivariate Chebyshev orthogonal polynomials (SMCh). This model is derived by the Fourier analysis, tensor product theory and the nonparametric mixture models of multivariate orthogonal polynomials. The mean integrated squared error is used to estimate the smoothing parameter for every model. The estimation of the number of density mixture components is solved employing the stochastic nonparametric expectation maximum algorithm, so as to compute the orthogonal polynomial coefficient and weight of each model.

In Ref. [29] is introduced a segmentation approach based on a Markov random field fusion model (MRFFM), which combines several segmentation results associated with simple clustering methods. The fusion model is based on the probabilistic Rand measure for comparing one segmentation result to one or more manual segmentations of the same image. This nonparametric measure allows deriving an appealing fusion model of label fields expressed as a Gibbs distribution. This Gibbs energy model encodes the binary constraints set given by the segmentation results to be fused.

In [9], a method for classifying color points in the HSI space, based on the distances between their projections onto the SI plane, is presented for automotive applications. It analyzes which particular features of the HSI space should be taken into consideration for segmentation in the automotive context and introduces a generic segmentation method based on intensity and saturation. The requirements for classifying the points into those classes are obtained, several weighting functions are employed, and a fast form of the Euclidean metric is studied. The sensitivity of the weighting function is improved using dynamic coefficients.

Reference [30] presents an algorithm based on the theory of gravity called stochastic-feature-based gravitational image segmentation algorithm (SGISA). The proposed algorithm employs color, texture and spatial data to partition the image. The algorithm is equipped with an operator called “scape”, inspired by the concept of escape velocity in physics. A stochastic characteristic is incorporated into the algorithm, which gives it the ability to search the image for the fittest pixels that are suitable for merging.

Jiang and Zhou [16] propose a segmentation method based on ensemble of SOM, which clusters the pixels in an image according to color and spatial features with many SOMs and then combines the clustering results to give the final segmentation.

In [24] is introduced a clustering algorithm which maintains coherence of data in feature space; the algorithm works under the paradigm of clustering then labeling, named segmentation by clustering then labeling (SCL). Applied on the L*a*b* color space, the image is segmented by setting each pixel with its corresponding cluster. The algorithm is based on the theory of minimum description length, which is an effective approach to select automatically the parameters for the proposed segmentation method.

In Ref. [25] is proposed an algorithm where bilateral filtering is employed as a kernel function to form a pixonal image. The bilateral filtering is a preprocessing step that eliminates unnecessary details of the image and results in a small number of pixons. Later, the computed pixonal image is segmented using FCM.

Guo and Sengur [26] apply neutrosophic set, which studies the origin, nature and scope of neutralities. A directional α-mean operation, directional α-fuzzy C-means (DAFCM), is proposed to reduce the set indeterminacy; the FCM algorithm is improved by integrating with neutrosophic set and employed to segment the color image. The membership computation and the clustering termination are redefined accordingly.

In Ref. [31] is proposed a fusion model derived from the precision–recall criterion (FMDPC), dedicated to the clustering problem of spatially indexed data. The proposed framework is designed to be robust with respect to outlier segmentations, including an explicit internal regularization factor that reflects the inherent ill-posed nature of the segmentation problem. The consensus energy function related to this fusion model is optimized with a simple, deterministic iterative relaxation strategy that combines the different segments belonging to the segmentation ensemble into the final solution.

Xue and Jia [32] segment images by color map segmentation; the map is imported into the computer by scanning, so the image exhibits cross-color artifacts and color noise. Because of the restrictions of the printing technology, there is some error in the map, so traditional segmentation methods are difficult to apply.

In [33] is introduced a semi-supervised clustering method based on modal analysis and mutational agglomeration algorithm in combination with the SOM. The modal analysis and mutational agglomeration are used for initial segmentation of the images. Then, the sampled pixels of the image are employed to train the SOM.

Khan and Jaffar [17] consider image segmentation as a clustering problem and use a fixed length genetic algorithm (FLGA) to handle it. In a fixed length genetic algorithm, the chromosomes have the same length, which is normally set by the user. A SOM is used to determine the number of segments in order to set the length of a chromosome automatically.

In Ref. [18] is proposed a SOM with variable topology for image segmentation. The proposed network is a fast convergent network capable of segmenting color images, which has an optimum self-adaptable topology.

Khan et al. [20] employ a spatial fuzzy genetic algorithm (SFGA) for the unsupervised segmentation of color images. The performance of SFGA is influenced by the number of clusters, which should be known in advance, and by the initialization of cluster centers. These issues are overcome by a progressive technique based on SOM to find the optimal number of clusters automatically; with respect to the initialization problem, peaks are identified using the image color histograms.

In Ref. [34] a scale estimate for the SOM is proposed; it determines the number of nodes of the competition layer from the 3D spatial distribution of pixels in the HSV color space. The pixels are then sampled to train the map topology of the image, and pixels are segmented by computing the similarity between their feature vectors and the weights of each node.

A modified version of FCM algorithm is presented in Ref. [21], where the spatial information is incorporated into the membership function for clustering of color images. A technique based on SOM is used to automatically find the number of optimal clusters.

Most of the reviewed works employ FCM-based methods; as mentioned before, the drawback with these methods is that the number of clusters must be defined a priori. Other works use unsupervised NN, mainly SOMs, but the NN employed is trained every time a novel image is given. That is, a NN trained with the colors of a given image cannot always recognize all the colors of a different image; hence, the NN must be trained with the colors of the new image. Also, under this approach, the computing load of the image processing may be huge, depending on the sizes of both the image and the NN.

In the following section, we introduce our proposal for image segmentation, which is based on a three-layered SOM trained to recognize the chromaticity of colors. Although our NN is unsupervised, it is not mandatory to define the number of clusters a priori; the number of colors our NN can recognize depends on the number of neurons. When a given color is fed to the NN, it excites only the neuron whose learned color is equal or similar. Besides, because of the way we train our proposed NN, we can apply it to any image without training the NN again.

3 Proposed approach

Although the RGB space is accepted by most of the image processing community to represent colors, humans do not perceive colors as represented in that space. Human color perception resembles the color representation of the HSV space [28]; hence, we employ this space; moreover, the training set for our NN is significantly smaller than if we used another space.

The classification of colors is not trivial; because of the fuzzy nature of colors, it is not possible to establish exact limits between colors. However, it is possible to cluster colors according to their hue features; for instance, pink hues can be grouped with red hues, or cyan hues with green hues. This is the reason cluster-based or unsupervised methods are often employed: they group the colors into clusters by finding the underlying relations in the data. Hence, we propose a three-layered SOM to classify colors, where the number of colors the NN can recognize depends on the number of neurons, especially in the third layer. The idea is that each neuron of the NN learns to recognize a specific color and gets excited only when a given color is equal or similar to the one it was trained to identify.

Our proposal consists of the steps shown in the flowchart of Fig. 3. The input image, acquired in the RGB space, is mapped to the HSV space; the hue component of every pixel is extracted, and the pixel hue is processed by the NN, previously trained with chromaticity samples. The hue, i.e., the orientation of the winning neuron’s weight vector, is assigned to the corresponding pixel, and the pixel is labeled with the number of the winning neuron; after processing all the pixels of the image, the resulting image is mapped back to the RGB space, and the segmented image is obtained.
Fig. 3

Flowchart of our proposal for image segmentation

In Sect. 3.1, we describe the RGB space and how RGB color vectors are mapped to the HSV space and vice versa; in Sect. 3.2, we introduce the architecture of the NN we propose and how it is trained; in Sect. 3.3, we explain how the color of a pixel is processed.

3.1 RGB and HSV color spaces

The RGB space is based on the Cartesian coordinate system where colors are points defined by vectors that extend from the origin [35], where black is located in the origin and white in the opposite corner to the origin, as shown in Fig. 4.
Fig. 4

RGB color space

The color of a pixel p is a linear combination of the basis vectors red, green and blue, which can be written as:
$$\phi_{p} = r_{p} \hat{i} + g_{p} \hat{j} + b_{p} \hat{k}$$
(1)
where the scalars \(r\), \(g\) and \(b\) are the red, green and blue components of the color vector, respectively; the range of each component is \(\left[ {0,255} \right] \subset {\mathbb{R}}\). The orientation and magnitude of a color vector define the chromaticity and the intensity of the color, respectively [35].

This space is not uniform, and its color representation does not match the way humans perceive color [35]. For color classification, clustering methods are not adequate in this space because the difference between two colors cannot simply be measured by their Euclidean distance [15]; two vectors with the same orientation but different magnitudes represent different colors, despite having the same chromaticity. For example, in Fig. 1 the color vectors of squares (a) and (b) have the same orientation but not the same magnitude; therefore, in this space, they are represented as different colors.

In order to mimic the human perception of color, we employ the uniform color space HSV; several works state that the representation of color using the HSV color space emulates the human perception of color because the chromaticity is decoupled from the intensity [9, 28, 35]. The HSV space is cone shaped, as shown in Fig. 5.
Fig. 5

HSV color space

The color of a pixel p in the HSV space has the elements hue (h), saturation (s) and value (v), that is:
$$\varphi_{p} = \left[ {h_{p} ,s_{p} ,v_{p} } \right]$$
(2)
where hue is the chromaticity, saturation is the distance to the achromatic black–white (gray) axis, and value is the intensity. Black is located at the cone’s tip and white at the center of the base of the cone. The chromaticity is distributed around the circumference of the cone; hence, the hue is in the range \(\left[ {0,2\pi } \right] \subset {\mathbb{R}}\); saturation is in the real range [0, 1], while value is often in the range \(\left[ {0,255} \right] \subset {\mathbb{R}}\).
Usually, the colors of acquired images are represented in the RGB space; thus, in order to process the images under our approach, the colors of the image are first mapped to the HSV space. The mapping of an RGB color vector ϕ = [r, g, b] to the HSV space is performed with the following equations. Let \(\Delta = \hbox{max} \left( {r,g,b} \right) - \hbox{min} \left( {r,g,b} \right)\):
$$h = \left\{ {\begin{array}{*{20}l} {{\text{undefined}},} \hfill & {\hbox{max} \left( {r,g,b} \right) = \hbox{min} \left( {r,g,b} \right)} \hfill \\ {\frac{{\pi \left( {g - b} \right)}}{{3\Delta }},} \hfill & {\hbox{max} \left( {r,g,b} \right) = r,\;g \ge b} \hfill \\ {\frac{{\pi \left( {g - b} \right)}}{{3\Delta }} + 2\pi ,} \hfill & {\hbox{max} \left( {r,g,b} \right) = r,\;g < b} \hfill \\ {\frac{{\pi \left( {b - r} \right)}}{{3\Delta }} + \frac{2}{3}\pi ,} \hfill & {\hbox{max} \left( {r,g,b} \right) = g} \hfill \\ {\frac{{\pi \left( {r - g} \right)}}{{3\Delta }} + \frac{4}{3}\pi ,} \hfill & {\hbox{max} \left( {r,g,b} \right) = b} \hfill \\ \end{array} } \right.$$
(3)
$$s = \left\{ {\begin{array}{*{20}l} {0,} \hfill & {\hbox{max} \left( {r,g,b} \right) = 0} \hfill \\ {1 - \frac{{\hbox{min} \left( {r,g,b} \right)}}{{\hbox{max} \left( {r,g,b} \right)}},} \hfill & {\text{otherwise}} \hfill \\ \end{array} } \right.$$
(4)
$$v = \hbox{max} \left( {r,g,b} \right)$$
(5)
Once the image is processed in the HSV space, it is mapped back to the RGB space, because most of the hardware employed to display digital images employs the RGB space. Mapping an HSV color vector φ = [h, s, v] to the RGB space involves the following operations.
$$r = \left\{ {\begin{array}{*{20}l} {q,} \hfill & {k = 1} \hfill \\ {p,} \hfill & {2 \le k \le 3} \hfill \\ {t,} \hfill & {k = 4} \hfill \\ {v,} \hfill & {\text{otherwise}} \hfill \\ \end{array} } \right.$$
(6)
$$g = \left\{ {\begin{array}{*{20}l} {t,} \hfill & {k = 0} \hfill \\ {v,} \hfill & {1 \le k \le 2} \hfill \\ {q,} \hfill & {k = 3} \hfill \\ {p,} \hfill & {\text{otherwise}} \hfill \\ \end{array} } \right.$$
(7)
$$b = \left\{ {\begin{array}{*{20}l} {p,} \hfill & {0 \le k \le 1} \hfill \\ {t,} \hfill & {k = 2} \hfill \\ {v,} \hfill & {3 \le k \le 4} \hfill \\ {q,} \hfill & {\text{otherwise}} \hfill \\ \end{array} } \right.$$
(8)
where
$$k = \lfloor\frac{3}{\pi }h\rfloor$$
(9)
$$f = \frac{3}{\pi }h - k$$
(10)
$$p = v \times \left( {1 - s} \right)$$
(11)
$$q = v \times \left( {1 - \left( {s \times f} \right)} \right)$$
(12)
$$t = v \times \left( {1 - \left( {s \times \left( {1 - f} \right)} \right)} \right)$$
(13)

In the next Sect. 3.2, we present the architecture of the NN we propose, and how the NN is trained using the hue data of the colors.

3.2 Neural network architecture and training

The size of the NN depends on the number of colors to recognize. It is not possible to recognize all the colors of the spectrum because of the fuzzy nature of color, but colors with resembling hues can be grouped, and the color spectrum thereby “divided” into a finite number of colors. If a SOM with a large number of neurons is employed, the segmentation can be poor; that is, if the chromatic difference between two colors is small, these two colors can still be assigned to different groups. On the other hand, if a SOM with few neurons is used, two colors can be assigned to the same group despite the fact that their chromaticity difference is huge. To address these circumstances, we propose the following.

If we connect the output of a SOM with several neurons to the input of a SOM with fewer neurons, we obtain a NN that is able to keep the background knowledge of the first SOM and, at the same time, to reduce the number of groups. So as not to lose chromaticity features, the size difference between the layers should not be large. Hence, we propose that the first layer be large and the second layer smaller, approximately half its size. But the second layer may still be large; thus, we propose to add another SOM as a third layer, smaller than the second layer, again approximately half its size.

Therefore, the architecture of the NN we use is as follows: the first and second layers are two SOMs with 256 and 100 neurons set on 16 × 16 and 10 × 10 arrays, respectively. The size of the third layer defines the number of colors the NN recognizes; after performing experiments with different sizes, we found that the most adequate size is 25 neurons set on a 5 × 5 array. For comparison, the architecture proposed in [15] uses two layers, where the second layer is a SOM with 20 neurons set on a 20 × 1 array.

For the training stage of the NN, a training set is built with chromaticity samples. As mentioned before, the hue lies in the interval 0 ≤ h ≤ 2π, but the chromaticity is transformed into a vector because of the special case when the hue is almost 0 or 2π. That is, considering squares (c) and (d) of Fig. 1, their hue values are π/100 and 19π/10, respectively. Numerically, both values are very different, but the chromaticities of both squares are alike; if we classified the chromaticity of both squares only by their scalar hue values, in this case they would be recognized as very different.

To overcome this problem, we transform the hue value into a two-element vector, where the magnitude of such vector is 1 and its orientation is the hue data. Let \(\varphi_{p} = \left[ {h_{p} ,s_{p} ,v_{p} } \right]\) be the color of pixel p in the HSV space; the chromaticity is modeled as:
$$\psi_{p} = \left[ {\cos \,h_{p} ,\sin \,h_{p} } \right]$$
(14)
With this chromaticity representation, the problem mentioned before is solved. Given the new representation of chromaticity, the training set is built with chromaticity samples as follows:
$$\Psi = \left\{ {\psi_{k} = \left[ {\cos \theta_{k} ,\sin \theta_{k} } \right]|\,\theta_{k} = \frac{2\pi }{256}k:k = 0,1, \ldots ,255} \right\}$$
(15)

The number of samples we employ is defined considering the SOM architecture we use; the idea is that each neuron of the SOM of the first layer gets excited just by one element of the training set.
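As an illustrative sketch (not the original code), the training set of Eq. (15) can be built as follows; the last lines reproduce the wrap-around example of squares (c) and (d) of Fig. 1, whose scalar hues are numerically far apart while their vector representations of Eq. (14) are close.

```python
import numpy as np

# Eq. (15): 256 unit vectors evenly spaced around the hue circle
theta = 2 * np.pi * np.arange(256) / 256
Psi = np.column_stack([np.cos(theta), np.sin(theta)])   # shape (256, 2)

# Wrap-around example, Eq. (14): scalar hues pi/100 and 19*pi/10
h1, h2 = np.pi / 100, 19 * np.pi / 10
psi1 = np.array([np.cos(h1), np.sin(h1)])
psi2 = np.array([np.cos(h2), np.sin(h2)])
print(abs(h1 - h2), np.linalg.norm(psi1 - psi2))        # ~5.94 vs ~0.34
```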

The SOM is a kind of competitive or unsupervised NN; it is based on finding the winning neuron in response to an external stimulus. In other words, the output neurons with weights \(\mathbf{w}_{k}\) compete among themselves to find the best match with the external pattern. We employ the Euclidean distance to measure the match between neuron \(\mathbf{w}_{k}\) and the external pattern \(\mathbf{x}_{p}\). The neuron \(k\) whose weight vector \(\mathbf{w}_{k}\) is the closest to \(\mathbf{x}_{p}\) is declared the winner. The winner and all the neurons within a neighborhood are updated using the Kohonen learning rule [36].

The three-layered SOM is trained as follows: let SOM 1, SOM 2 and SOM 3 be the first, second and third layers, respectively, of the NN. The SOM 1 is trained with the elements of set \(\Psi\).

The SOM 2 is trained with the set \({\varPsi \bigcup }W^{1}\), where \(W^{1} = \left\{ {{\mathbf{w}}_{k}^{1} |k = 1, \ldots ,256} \right\}\) is the set of neurons’ weighting vectors of SOM 1 after training. Let \(W^{2} = \left\{ {{\mathbf{w}}_{k}^{2} |k = 1, \ldots ,100} \right\}\) be the set of neurons’ weighting vectors of SOM 2 after training; similarly, the SOM 3 is trained with the set \({\varPsi \bigcup }W^{2}\). Figure 6 shows the color features maps for the three layers after training.
Fig. 6

a–c Color feature maps of the first, second and third layers, respectively, after training (color figure online)
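A minimal sketch of the training cascade is shown below, reusing the set Psi built above. The paper specifies the layer sizes, the Euclidean matching and the Kohonen rule; the Gaussian neighborhood, the linear decay of the learning rate and radius, and the hyperparameter values here are our assumptions.

```python
import numpy as np

def train_som(data, rows, cols, epochs=50, lr0=0.5):
    """Train one SOM layer with the Kohonen rule on 2-D chromaticity vectors."""
    rng = np.random.default_rng(0)
    w = rng.standard_normal((rows * cols, 2))
    w /= np.linalg.norm(w, axis=1, keepdims=True)            # random unit vectors
    grid = np.array([(i, j) for i in range(rows) for j in range(cols)], float)
    sigma0 = max(rows, cols) / 2.0
    for e in range(epochs):
        frac = 1.0 - e / epochs                              # linear decay (assumed)
        lr, sigma = lr0 * frac, sigma0 * frac + 1e-3
        for x in rng.permutation(data):
            win = np.argmin(np.linalg.norm(w - x, axis=1))   # winning neuron
            d2 = np.sum((grid - grid[win]) ** 2, axis=1)
            nbh = np.exp(-d2 / (2.0 * sigma ** 2))           # Gaussian neighborhood
            w += lr * nbh[:, None] * (x - w)                 # Kohonen update
    return w

W1 = train_som(Psi, 16, 16)                       # SOM 1: trained on Psi
W2 = train_som(np.vstack([Psi, W1]), 10, 10)      # SOM 2: Psi U W1
W3 = train_som(np.vstack([Psi, W2]), 5, 5)        # SOM 3: Psi U W2
```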

3.3 Color processing

As we have stated before, colors are processed by extracting their chromaticity, but it is important to mention that the NN can recognize neither black nor white, because these colors do not have a specific chromaticity. White is obtained when the saturation of the color is low, i.e., when s ≈ 0; on the other hand, black is obtained when the intensity of the color is low, that is, when v ≈ 0. Since our NN is trained with chromaticity data, before a color is processed by the NN we evaluate its saturation and value so as to classify it as black or white. Processing the color of a pixel involves the following steps.

The color vector of the pixel represented in the RGB space is mapped to the HSV space; the intensity of the color is analyzed to determine whether the color is black. If the color of the pixel is not black, then the saturation of the color is analyzed to determine whether the color is white. If the color is neither black nor white, then the color of the pixel is a chromaticity. The chromaticity of the pixel is extracted with Eq. (14); the computed vector ψ is fed to the NN, and the weight vector of the winning neuron of the third layer is obtained. The pixel is labeled with the number of the winning neuron, and the hue of the pixel is set by computing the orientation of the winning neuron’s weight vector. The resulting color vector, represented in the HSV space, is mapped to the RGB space. Algorithm 1 summarizes these steps, where \(\delta_{s}\) and \(\delta_{v}\) are the thresholds for saturation and value, respectively. Due to the fuzzy nature of color, there are no specific numeric values to decide exactly when a color is black or white. Experimentally, we found that the best values for the thresholds are \(\delta_{s} = \mu_{s} - \sigma_{s}\) and \(\delta_{v} = \mu_{v} - \sigma_{v}\), where \(\mu_{s}\) and \(\mu_{v}\) are the means of the saturation and intensity values of the image, respectively, and \(\sigma_{s}\) and \(\sigma_{v}\) are the corresponding standard deviations. In order to enhance the colors, we set the saturation and the value to 1 and 191, respectively.
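Since Algorithm 1 is given as a figure that is not reproduced here, the following sketch reconstructs the per-pixel procedure from the description above; the function names are hypothetical, the black/white output colors are our choice, and rgb_to_hsv, hsv_to_rgb and W3 refer to the earlier sketches.

```python
import numpy as np

def image_thresholds(sat, val):
    """delta_s = mu_s - sigma_s and delta_v = mu_v - sigma_v over the whole image."""
    return sat.mean() - sat.std(), val.mean() - val.std()

def segment_pixel(r, g, b, W3, delta_s, delta_v):
    """Label one pixel as black, white, or the winning neuron of the third layer."""
    h, s, v = rgb_to_hsv(r, g, b)
    if v <= delta_v:                              # low intensity: black
        return 'black', (0, 0, 0)
    if s <= delta_s:                              # low saturation: white
        return 'white', (255, 255, 255)
    psi = np.array([np.cos(h), np.sin(h)])        # chromaticity, Eq. (14)
    win = int(np.argmin(np.linalg.norm(W3 - psi, axis=1)))
    h_out = np.arctan2(W3[win, 1], W3[win, 0]) % (2 * np.pi)
    return win, hsv_to_rgb(h_out, 1.0, 191)       # enhanced output: s = 1, v = 191
```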

In the following section, we show the experiments performed using as benchmark the Berkeley segmentation database; we compare the performance of our approach, measured with certain metrics, with other related works.

4 Experiments and results

The performance evaluation of color image segmentation algorithms has been subjective [37], but recently different metrics have been proposed to evaluate the algorithms. There are two schools of thought for algorithm evaluation: the first proposes that algorithms must be evaluated within the context of their particular task; the second proposes that algorithms must be evaluated by their performance with respect to a defined ground truth [38].

The second school of thought has been adopted in several related works, where the Berkeley segmentation database (BSD) is becoming the standard benchmark to test algorithms for color image segmentation. The BSD is a database of natural images for which the ground truth is known; it contains 300 color images of size 481 × 321 pixels, and for each of these images the database provides between 4 and 9 human segmentations in the form of label maps. Therefore, we employ the BSD as the benchmark for our experiments, so as to compare our results with other approaches.

According to the reviewed papers, despite the fact that several metrics have been proposed, no absolute metrics for evaluating the algorithms quantitatively seem to have been defined yet; however, we have observed in different papers [17, 22, 23, 24, 29, 30, 31] that the Probabilistic Rand Index (PRI) and the variation of information (VOI) are becoming the standard metrics for quantitative evaluation. Thus, we employ these metrics to evaluate our algorithm.

The PRI compares the image obtained from the tested algorithm to a set of manually segmented images. Let \(\{I_{1}, \ldots, I_{m}\}\) and \(S\) be the ground truth set and the segmentation provided by the tested algorithm, respectively. \(L_{i}^{I_{k}}\) is the label of pixel \(x_{i}\) in the \(k\)th manually segmented image, and \(L_{i}^{S}\) is the label of pixel \(x_{i}\) in the tested segmentation. The PRI index is computed with:
$${\text{PRI}}\,\left( {S,I_{k} } \right) = \frac{2}{{n\left( {n - 1} \right)}}\mathop \sum \limits_{i,j,i < j} \left( {p_{i,j}^{{c_{i,j} }} \left( {1 - p_{i,j} } \right)^{{1 - c_{i,j} }} } \right)$$
(16)
where \(n\) is the number of pixels; \(c_{i,j}\) is a Boolean function: \(c_{i,j} = 1\) if \(L_{i}^{S} = L_{j}^{S}\), \(c_{i,j} = 0\) otherwise; and \(p_{i,j}\) is the expected value of the Bernoulli distribution for the pixel pair, i.e., the fraction of the manual segmentations in which pixels \(x_{i}\) and \(x_{j}\) share the same label.
The VOI index measures the sum of the information loss and gain between two clusterings belonging to the lattice of possible partitions, in the following way:
$${\text{VOI}}\,\left( {S,I_{k} } \right) = H\left( S \right) + H\left( {I_{k} } \right) - 2F\left( {S,I_{k} } \right)$$
(17)
where \(H\) is the entropy \(- \mathop \sum \nolimits_{i = 1}^{c} \frac{{n_{i} }}{n}\log \frac{{n_{i} }}{n}\), with \(n_{i}\) the number of points belonging to the \(i\)th cluster and \(c\) the number of clusters, and \(F\) is the mutual information between two clusterings, defined as:
$$F\,\left( {S,I_{k} } \right) = \mathop \sum \limits_{i = 1}^{c} \mathop \sum \limits_{j = 1}^{c} \frac{{n_{i,j} }}{n}\log \frac{{n\,n_{i,j} }}{{n_{i} n_{j} }}$$
(18)
where \(n_{i,j}\) is the number of points in the intersection of cluster \(i\) of \(S\) and cluster \(j\) of \(I_{k}\). The ranges of the PRI and VOI metrics are [0, 1] and [0, ∞), respectively. The higher the value of PRI, the better the segmentation; similarly, the lower the value of VOI, the better the segmentation.
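For reference, the two metrics can be computed from label maps as in the sketch below; this is not the evaluation code used in the experiments, and the PRI routine is a naive O(n²) transcription of Eq. (16), adequate only for small images.

```python
import numpy as np

def entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / labels.size
    return -np.sum(p * np.log(p))

def mutual_information(a, b):
    """F(S, I_k), Eq. (18): mutual information from the contingency counts n_ij."""
    n = a.size
    _, ai = np.unique(a, return_inverse=True)
    _, bi = np.unique(b, return_inverse=True)
    n_ij = np.zeros((ai.max() + 1, bi.max() + 1))
    np.add.at(n_ij, (ai, bi), 1)
    p_ij = n_ij / n
    p_i, p_j = p_ij.sum(1, keepdims=True), p_ij.sum(0, keepdims=True)
    nz = p_ij > 0
    return np.sum(p_ij[nz] * np.log((p_ij / (p_i * p_j))[nz]))

def voi(s, g):
    """Eq. (17): VOI = H(S) + H(I_k) - 2 F(S, I_k)."""
    s, g = s.ravel(), g.ravel()
    return entropy(s) + entropy(g) - 2.0 * mutual_information(s, g)

def pri(s, ground_truths):
    """Eq. (16); p^c (1-p)^(1-c) reduces to p if c else 1-p for c in {0, 1}."""
    s = s.ravel()
    gts = [g.ravel() for g in ground_truths]
    n, total = s.size, 0.0
    for i in range(n):
        for j in range(i + 1, n):
            c = s[i] == s[j]
            p = np.mean([g[i] == g[j] for g in gts])
            total += p if c else 1.0 - p
    return 2.0 * total / (n * (n - 1))
```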

The experiments are divided into Sects. 4.1 and 4.2: in Sect. 4.1, we use a NN with the architecture described in Sect. 3.2, while in Sect. 4.2 we employ a NN whose third layer has 4 neurons. In the related works, the number of clusters set by the users ranges from 2 to 5; in other words, the number of colors they recognize is between 2 and 5. Thus, in order to emulate this number, we run tests using a NN with 4 neurons in the last layer, set on a 2 × 2 array. In both sections, we process the same images used in [23] by applying both NNs, and we compare our results with the ones obtained by [16] and [23], because they report high accuracy. It is important to remark that the NNs for the following experiments are trained just once, as explained in Sect. 3.2, and we employ them to process all the images.

4.1 Experiments with NN with 25 neurons in the third layer

Table 1 shows the images employed in [23] and also the ones we use for our experiments. As mentioned before, we evaluate our algorithm and compare the results with the ones obtained using the approaches of Refs. [16, 23] using the PRI and VOI metrics.
Table 1

Images employed for experiments; original images taken from the Berkeley image database

Table 2 shows the resulting images using the images of Table 1, by applying our proposal and the methods FCM_RMLrgb, FCM_RMLjk and SMCh.
Table 2

Results after processing the images of Table 1

The appearances of all the resulting images resemble those obtained by the other methods, except the images 42049, 176035, 101084 and 210088, where the segmentation of the images is visibly different.

We can appreciate from the images obtained after processing the image 42049 with the other methods that the eagle and the branches are segmented from the background; while using the NN, different kinds of colors of the eagle, the branch and the background are segmented. The segmentation in image 176035 is not clear, and the green hues within the image are alike; thus, it is difficult for the NN to identify the color differences, mainly the hues of the pixels that correspond to the sky.

The images obtained from image 101084 using the SMCh method and our approach are similar, although with our proposal the figure within the scene is segmented from the background. In the background, although the main hue is green, there are different levels of green; thus, the background is not totally homogeneous, as shown in the image obtained with the FCM_RMLjk method. In the image resulting from processing image 210088 with the NN, there are more colors than in the images obtained with the other methods. With our proposal, the fish is segmented from the background; the images obtained with the methods FCM_RMLrgb and SMCh do not separate the fish from the background. With the method FCM_RMLjk, the fish is separated from the background, but the background loses color features.

Table 3 shows the average performance obtained per method by using the metrics VOI and PRI.
Table 3

Average performance obtained per method using the images of Table 1

Method        VOI     PRI
FCM_RMLrgb    1.801   0.796
FCM_RMLjk     0.873   0.864
SMCh          2.273   0.739
Our proposal  2.3922  0.8301

Best performances are given in italics

The rankings obtained with the two metrics are similar. According to the metrics, the best segmentation is achieved using the FCM_RMLjk method; with the PRI metric, our proposal is the second best, but with the VOI metric we obtain the lowest-ranked (highest) value.

4.2 Experiments with NN with 4 neurons in the third layer

As mentioned before, in this section we show the results obtained by using a NN whose third layer has 4 neurons. Table 4 shows the images obtained after processing the images of Table 1 with the second NN.
Table 4

Images obtained after processing the images of Table 1 with the second NN

It is easy to observe that the segmentation of several images is poorer than with the first NN, but in some images the segmentation is improved. For instance, in image 42049 the eagle and the branches have the same color, some parts of the branches are black, and most of the background is white. In image 67079, the color of the building is homogeneous, a kind of green, and the background is almost blue. In image 80099, the hue of the area around the bull is a homogeneous kind of green, while the bull is segmented in white and black.

Image 210088 is similar to the images obtained with the methods FCM_RMLrgb and SMCh in the previous section: the color of the background is more homogeneous, but the fish is also mixed with the background; thus, the silhouette of the fish cannot be appreciated. Almost the same happens with images 113044, 124084 and 196073, where the objects within the scene have the same color as the background; hence, the objects cannot be appreciated.

Image 35070 has the same appearance as the images obtained in the previous section, except for the leaf where the insect stands. In image 35010, the segmentation is not totally successful because the color of the butterfly’s wings is the same as the color of the leaves around the butterfly.

The segmentation of images 135069, 232038, 167062 and 101084 is almost the same as in Sect. 4.1. The average performances measured with the metrics are also lower than those obtained in the previous section; see Table 5.
Table 5

Average performances obtained per metric using the images of Table 1, processed with the second NN

         VOI     PRI
Average  2.6295  0.7725

Due to the small size of the third layer, several colors, although chromatically different, are identified by the NN as if they were the same. For instance, in image 113044, although the color of the horses is red and the background is green, these colors are classified as the same class; hence, the segmentation fails. In other images, although the colors are not assigned adequately or do not keep their chromaticity, the segmentation is more precise; for example, image 42049. In general, the performance is lower than in Sect. 4.1, as listed in Table 5.

5 Discussion

From the results obtained in Sects. 4.1 and 4.2, the best image segmentation is obtained with the first NN; these results resemble those obtained with the other methods, while the precision of the segmentation results using the second NN is lower. In Sect. 5.1, we explain why the first NN works better than the second NN. In Sect. 5.2, we analyze an adequate size for segmentation by comparing the performance of neural networks with architectures and sizes similar to our proposal. In Sect. 5.3, we compare the performance of our proposal with related works by processing all the images of the BSD.

5.1 Performance of the neural network

The image segmentation using our proposal is not so precise when the chromaticity features of the different colors within the scene are alike. For instance, image 196073 of Table 2 is difficult to segment because of the camouflage effect; that is, the colors of the objects within the scene are almost the same. Even for humans it is difficult to identify objects when the camouflage effect is present. Although the chromaticities of the snake and the sand are alike, the snake is segmented from the background by the first NN. With the second NN, the snake is not segmented from the background because the second NN recognizes just four colors (see image 196073 of Table 4); hence, the small difference between the chromaticity of the snake and that of the sand is not distinguished by the second NN.

Another image with a similar situation is image 35070. The leaf and the background have almost the same chromaticity; hence, these two parts may be confused by the segmentation methods as if they were the same object. Using the first NN and the methods FCM_RMLrgb and FCM_RMLjk, the leaf and the background are segmented as different objects, while with the second NN the segmentation of the leaf and the background does not succeed; with the method SMCh, the segmentation of these two objects is not totally defined.

In image 113044 of Table 2, the horses are segmented as if they were the same object because they have the same chromaticity, which agrees with the ground truth established in the BSD. The camouflage effect is present in image 101084 because of the jungle environment of the background; most of the background is green, but there are different hues of green, so that despite the similarity of hues several objects are segmented successfully. The area around the bull in image 80099 is almost homogeneous, with some pixels of different kinds of green hues. Image 167062 is segmented basically into two parts, the white and black pixels, but the wolf is segmented as if it were part of the white area.

The first NN works better for images or scenes that contain chromatically different colors; in images 35010, 67079, 118035, 124084, 135069, 232038 and 210088, all or most of the segmented color areas are homogeneous. However, it is difficult, to some extent, for this NN to segment images where the chromaticities of colors are very similar. For example, image 176035 is not segmented successfully because most of the pixels have almost the same green hue; nevertheless, as mentioned before, image 196073 is segmented as expected. On the other hand, the results obtained with the metrics resemble those obtained with the other methods; see Table 3.

With the second NN, the segmentation quality of some images is reduced, as we can observe from the metric results and also from the appearance of the resulting images. Although several colors are chromatically different, the second NN recognizes them as if they were the same. For instance, in image 113044 of Table 4, the horses’ color is red and the background is green; however, these colors are classified as if they were the same, and therefore the segmentation does not succeed. In other images, although the assigned colors are not adequate, the segmentation or pixel labeling is more precise; for example, in image 80099 of Table 4, the color of the area around the bull is slightly more homogeneous than the area obtained with the first NN; see Table 2.

It is easy to appreciate that the segmentation is better using the NN employed in Sect. 4.1 than the NN used in Sect. 4.2. Image segmentation with the second NN does not always succeed; because of the small classification space, several colors are clustered in the same group despite being chromatically very different, as we explained before for image 113044 of Table 4. In some images, the segmentation is precise, for example images 67079, 135069 and 80099 of Table 4, but if more colors are present in the scene, the segmentation may not be successful.

Thus, the performance of the first NN is better than the second NN, according to the metrics results. The second NN is useful if the amount of colors within an image is small; if the amount of colors of an image is huge, then the NN cannot segment the image adequately. The first NN is able to segment images successfully with a large or small amount of colors within a scene. Its performance falls if the chromaticities of the image’s colors are very similar.

5.2 Proposed size of the NN

As mentioned before, if the size difference between layers is large, the training background of previous layers may be lost. With one layer and few neurons, dissimilar colors are grouped in the same cluster; with one layer and many neurons, it is difficult to obtain homogeneous regions. To show this, we test NNs with architectures similar to the NN we propose. The first NN is a 5 × 5 neuron array SOM (NN-25); the second NN has two layers, 16 × 16 and 10 × 10 neuron array SOMs (NN-256-100); the third NN has two layers, 16 × 16 and 5 × 5 neuron array SOMs (NN-256-25); the fourth NN is a 16 × 16 neuron array SOM (NN-256). Table 6 shows the images obtained by processing the images of Table 1 with these architectures.
Table 6

Images obtained after processing the images of Table 1 with NN with different architectures

The images of the first, second, third and fourth columns are obtained with the neural networks NN-25, NN-256-100, NN-256-25 and NN-256, respectively.

The appearance of the resulting images of Table 6 resembles those obtained with our proposed NN. However, there are details in the segmentation that make our NN more precise. For instance, the butterfly of image 35010 is segmented from the background, but with the other NNs the color of the butterfly’s wings is not homogeneous; the red hue of the horses in image 113044 is not homogeneous; none of these NNs segments the snake of image 196073 from the sand; and the color of the sky in image 135069 obtained with the architectures NN-256-100 and NN-256 is not homogeneous. The images resulting from processing images 42049 and 176035 are almost the same. In image 80099, the chromaticity of the area around the bull using NN-256 is almost homogeneous, like the image we obtain with our proposed NN; with the other architectures, the color of the same area is notably non-homogeneous.

Table 7 shows the results obtained using the metrics VOI and PRI with the NN we described in this section.
Table 7

Results obtained with metrics VOI and PRI with neural networks NN-25, NN-256-100, NN-256-25, NN-256 and our proposal

Image     NN-25            NN-256-100       NN-256-25        NN-256           NN-256-100-25
          VOI      PRI     VOI      PRI     VOI      PRI     VOI      PRI     VOI      PRI
35010     2.4510   0.8266  2.4684   0.8027  2.4128   0.8174  2.5941   0.7989  2.3230   0.8585
42049     3.7975   0.5292  3.7821   0.5330  3.8976   0.5333  3.8658   0.5331  3.9368   0.5168
67079     2.9638   0.7367  3.1124   0.7235  3.5671   0.7182  3.2504   0.7253  2.9226   0.7500
80099     2.6717   0.7879  2.5679   0.7298  2.3218   0.8171  2.4321   0.8495  2.2788   0.8860
113044    2.7197   0.7095  2.6489   0.7109  2.5942   0.7131  2.7649   0.7103  2.6208   0.7284
118035    2.2769   0.8029  2.3143   0.8043  2.2356   0.8114  1.9869   0.8216  2.1512   0.8704
124084    2.7099   0.7543  2.5490   0.7763  2.7874   0.7542  2.4631   0.7632  2.2427   0.8123
135069    1.9876   0.9884  2.1650   0.9201  2.0148   0.9890  1.9903   0.9451  1.8036   0.9899
167062    1.8041   0.9649  1.9983   0.9640  1.9781   0.9659  1.8874   0.9635  1.7688   0.9746
176035    2.9058   0.7322  2.9114   0.7314  2.8983   0.7375  2.9046   0.7290  2.8899   0.7417
196073    2.5347   0.6593  2.2101   0.7427  2.2891   0.7463  2.1640   0.7474  1.5859   0.9496
232038    2.9143   0.7816  2.8669   0.7817  2.8791   0.7826  2.8602   0.7806  2.8531   0.7868
101084    2.1950   0.7823  2.2116   0.7751  2.1394   0.7873  2.1571   0.7838  2.0563   0.8025
35070     2.3514   0.8066  2.3041   0.8406  2.3487   0.8152  2.2396   0.8421  2.0246   0.9565
210088    2.6107   0.7153  2.5233   0.7301  2.5901   0.7228  2.5123   0.7223  2.4243   0.8262
Average   2.5930   0.7719  2.5756   0.7711  2.5970   0.7108  2.5382   0.7811  2.3922   0.8301

Figure 7 shows graphically the values obtained with the PRI metric given in Table 7.
Fig. 7

Graphic of the PRI metric values obtained with neural networks NN-25, NN-256-100, NN-256-25, NN-256 and our proposal

Figure 8 shows graphically the values obtained with the VOI metric given in Table 7.
Fig. 8

Graphic of the VOI metric values obtained with neural networks NN-25, NN-256-100, NN-256-25, NN-256 and our proposal

Considering the results of Table 7 and Figs. 7 and 8, the NN with the highest performance is the one we propose. The second highest is the network NN-256; it does not always obtain homogeneous areas, as in images 135069 or 113044. As claimed before, if the NN has many neurons, colors with similar chromaticity can be grouped into different clusters. The network NN-256-25 has the lowest performance; a plausible reason is that the size difference between its two layers is large, so some of the training background of the first layer is lost. In some cases, colors with different hues are grouped into the same cluster; for instance, the leaf in image 35070 is not totally segmented from the background, the dome of the church in image 118035 is segmented with two different kinds of red hues, and the columns of the building shown in image 67079 have two kinds of yellow hues.

The networks NN-25 and NN-256-100 have almost the same performance, and both perform better than NN-256-25. In images 113044, 124084, 196073 and 35070, the segmentation is better with the network NN-256-100.

5.3 Quantitative comparison with previous works

Refs. [16, 23] report results obtained only with the images shown in Table 1; the whole BSD is not processed. We process all the images of the BSD and compare our results, obtained with the network NN-256-100-25, with related works that also process all the images of the BSD and employ the VOI and/or PRI metrics. The average rates obtained with our approach and those reported by related works are listed in Table 8.
Table 8

Average rates comparison between related works and our proposal by processing all the images of the Berkeley image database

Method        VOI     PRI
DAFCM [26]    –       0.7720
MRFFM [29]    –       0.8006
FLGA [17]     1.9239  0.8332
SFGA [20]     1.9182  0.7852
SCL [24]      1.9690  0.7583
SGISA [30]    3.4799  0.7252
FMDPC [31]    2.0100  0.8000
Our proposal  2.4672  0.7989

Best performances are given in italics

Refs. [26, 29] report results only for the PRI metric. According to Table 8, our proposal does not obtain the highest values with respect to those reported in related works; however, the values obtained with our approach are close to the highest reported values. Hence, we can claim that our proposal is competitive.

Our approach usually obtains low performance in images with white or black parts, or when the chromaticities of the colors are very similar. As mentioned before, the NN can recognize neither white nor black because they are not chromaticities. In order to segment white and black areas within the images, the saturation and intensity values are thresholded, respectively; these threshold values are computed as described in Sect. 3.3. However, due to the fuzzy nature of color, there is no specific threshold value to determine whether a color is white or black; hence, some white or black parts are not segmented correctly.

Figure 9 shows two examples of images where black and white colors are not recognized successfully, leading to imprecise segmentation. On the other hand, if the chromaticities of different sections of the image are very similar, the NN segments them as if they were the same chromaticity; thus, such areas are segmented as if they were a single area, as discussed in Sect. 5.1.
Fig. 9

Examples of images where sections in black or white are not segmented correctly

6 Conclusions and future work

We have introduced an artificial neural network with three layers of self-organizing maps for the segmentation of color images. Human color perception is emulated, in the sense that humans recognize colors mainly by their chromaticity. Samples of chromaticity data are employed to train the neural network; later, a given image is processed by extracting the chromaticities of its colors, which are fed to the neural network.

According to our experiments, the recommended sizes for the self-organizing maps of the first, second and third layers are 256, 100 and 25 neurons, arranged in 16 × 16, 10 × 10 and 5 × 5 arrays, respectively. The quantitative values obtained with the Probabilistic Rand Index and Variation of Information metrics, using the images of the Berkeley Segmentation Database, are similar to the results reported in related works.
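To illustrate how such a cascade can be assembled, the following Python sketch uses the MiniSom library; it is not the authors' implementation, and it assumes both that each layer is trained on the codebook vectors of the previous layer and that hue is encoded as a unit vector (cos h, sin h) to respect its circular nature.

```python
# Ours, not the authors' implementation: a three-layer SOM cascade built with
# the MiniSom library, assuming each layer is trained on the codebook of the
# previous one and that hue is encoded as (cos h, sin h) to keep it circular.
import numpy as np
from minisom import MiniSom

def hue_features(hue_degrees):
    """Encode hue angles as points on the unit circle."""
    rad = np.deg2rad(hue_degrees)
    return np.column_stack([np.cos(rad), np.sin(rad)])

rng = np.random.default_rng(0)
samples = hue_features(rng.uniform(0, 360, 5000))  # synthetic chromaticity samples

layers, data = [], samples
for side in (16, 10, 5):  # 256, 100 and 25 neurons, as recommended above
    som = MiniSom(side, side, input_len=2, sigma=1.0, learning_rate=0.5)
    som.train_random(data, 10000)
    layers.append(som)
    data = som.get_weights().reshape(-1, 2)  # next layer learns this codebook
```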

It is important to remark the following about our proposal:
  1. With our proposal, it is not necessary to know a priori the number of colors the image contains. Since the usual methods for the segmentation of color images are cluster-based or unsupervised, such as fuzzy C-means, the user must specify the number of clusters into which the image is segmented; that is, such methods demand that the number of groups be defined before the algorithm runs. Although self-organizing maps are also an unsupervised method, it is not mandatory to specify the number of regions into which the image must be segmented. The number of neurons in the neural network equals the number of colors it can recognize; thus, the image is segmented according to those colors. A neuron is excited when the given color is the same as, or similar to, the one it learned to recognize.

  2. The neural network we employ is trained just once; it can then be employed to segment any new image (see the sketch after this list). With the usual approaches, the classification methods are trained to recognize the colors of a specific image; if a new image is given, the classification method must be trained again.

  3. The quantitative comparison between the current proposal and related works (see Table 8) shows that our approach is competitive.

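The following sketch (ours, continuing the hypothetical MiniSom cascade above) illustrates the train-once, use-many behavior of point 2: the same trained map labels the chromatic pixels of any number of images without retraining.

```python
# Ours, continuing the MiniSom sketch above: one trained map labels the
# chromatic pixels of any image, with no retraining between images.
import numpy as np

def hue_features(hue_degrees):
    """Encode hue angles as unit vectors, matching the training encoding."""
    rad = np.deg2rad(hue_degrees)
    return np.column_stack([np.cos(rad), np.sin(rad)])

def segment_by_chromaticity(hue_degrees_image, som):
    """Map every pixel's hue to the flat index of its winning neuron."""
    h, w = hue_degrees_image.shape
    feats = hue_features(hue_degrees_image.ravel())
    winners = np.array([som.winner(f) for f in feats])  # (row, col) per pixel
    side = som.get_weights().shape[0]
    return (winners[:, 0] * side + winners[:, 1]).reshape(h, w)

# The same `som` is reused for every image in the database, e.g.:
#   labels_a = segment_by_chromaticity(hue_of(image_a), som)
#   labels_b = segment_by_chromaticity(hue_of(image_b), som)
# where `hue_of` is a hypothetical RGB -> hue extraction helper.
```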
As future work, it is necessary to employ the intensity and/or saturation of colors, both to avoid the camouflage effect and to establish a gray scale; in this work we employ only two achromatic levels, black and white, and a fuzzy logic-based approach seems suitable for this problem. We have also shown that a neural network trained just once can process any image without being retrained, unlike cluster-based methods, which are usually retrained when a new image is given.

Acknowledgements

This work was sponsored by the Secretaría de Educación Pública under agreement PROMEP/103.5/13/6535. We thank Francisco Gallegos Funes for his valuable help and support.

References

  1. Gökmen V, Sügüt I (2007) A non-contact computer vision based analysis of color in foods. Int J Food Eng 3(5). doi:10.2202/1556-3758.1129
  2. Lopez JJ, Cobos M, Aguilera E (2011) Computer-based detection and classification of flaws in citrus fruits. Neural Comput Appl 20(7):975–981
  3. Lepistö L, Kuntuu I, Visa A (2005) Rock image classification using color features in Gabor space. J Electron Imaging 14(4):1–3
  4. Ghoneim DM (2011) Optimizing automated characterization of liver fibrosis histological images by investigating color spaces at different resolutions. Theor Biol Med Model 8:25
  5. Harrabi R, Braiek EB (2012) Color image segmentation using multi-level thresholding approach and data fusion techniques: application in the breast cancer cells images. EURASIP J Image Video Process 2012:11. doi:10.1186/1687-5281-2012-11
  6. Lingala M, Stanley RJ, Rader RK, Hagerty J, Rabinovitz HS, Oliveiro M, Choudhry I, Stoecker WV (2014) Fuzzy logic color detection: blue areas in melanoma dermoscopy images. Comput Med Imaging Graph 38(5):403–410
  7. Wang F, Man L, Wang B, Xiao Y, Pan W, Lu X (2008) Fuzzy-based algorithm for color recognition of license plates. Pattern Recognit Lett 29(7):1007–1020
  8. del Fresno M, Macchi A, Marti Z, Dick A, Clausse A (2006) Application of color image segmentation to estrus detection. J Vis 9(2):171–178
  9. Rotaru C, Graf T, Zhang J (2008) Color image segmentation in HSI space for automotive applications. J Real Time Image Process 3(4):311–322
  10. Bianconi F, Fernández A, González E, Saetta SA (2013) Performance analysis of colour descriptors for parquet sorting. Expert Syst Appl 40(5):1636–1644
  11. Aghbarii ZA, Haj RA (2006) Hill-manipulation: an effective algorithm for color image segmentation. Image Vis Comput 24(8):894–903
  12. Mignotte M (2014) A non-stationary MRF model for image segmentation from a soft boundary map. Pattern Anal Appl 17(1):129–139
  13. Liu Z, Song YQ, Chen JM, Xie CH, Zhu F (2012) Color image segmentation using nonparametric mixture models with multivariate orthogonal polynomials. Neural Comput Appl 21(4):801–811
  14. Mousavi BS, Soleymani F, Razmjooy N (2013) Color image segmentation using neuro-fuzzy system in a novel optimized color space. Neural Comput Appl 23(5):1513–1520
  15. Ong S, Yeo N, Lee K, Venkatesh Y, Cao D (2002) Segmentation of color images using a two-stage self-organizing network. Image Vis Comput 20(4):279–289
  16. Jiang Y, Zhou ZH (2004) SOM ensemble-based image segmentation. Neural Process Lett 20(3):171–178
  17. Khan A, Jaffar MA (2015) Genetic algorithm and self organizing map based fuzzy hybrid intelligent method for color image segmentation. Appl Soft Comput 32:300–310
  18. Araujo A, Costa DC (2009) Local adaptive receptive field self-organizing map for image color segmentation. Image Vis Comput 27(9):1229–1239
  19. Stephanakis IM, Anastassopoulos GC, Iliadis LS (2010) Color segmentation using self-organizing feature maps (SOFMs) defined upon color and spatial image space. In: Artificial neural networks—ICANN 2010, Lecture Notes in Computer Science (LNCS), vol 6352, pp 500–510
  20. Khan A, Ullah J, Jaffar MA, Choi TS (2014) Color image segmentation: a novel spatial fuzzy genetic algorithm. Signal Image Video Process 8(7):1233–1243
  21. Khan A, Jaffar MA, Choi TS (2013) SOM and fuzzy based color image segmentation. Multimed Tools Appl 64(2):331–344
  22. Wang L, Dong M (2012) Multi-level low-rank approximation-based spectral clustering for image segmentation. Pattern Recognit Lett 33(16):2206–2215
  23. Mújica-Vargas D, Gallegos-Funes FJ, Rosales-Silva AJ (2013) A fuzzy clustering algorithm with spatial robust estimation constraint for noisy color image segmentation. Pattern Recognit Lett 34(4):400–413
  24. Huang R, Sang N, Luo D, Tang Q (2011) Image segmentation via coherent clustering in L*a*b* color space. Pattern Recognit Lett 32(7):891–902
  25. Nadernejad E, Sharifzadeh S (2013) A new method for image segmentation based on fuzzy c-means algorithm on pixonal images formed by bilateral filtering. Signal Image Video Process 7(5):855–863
  26. Guo Y, Sengur A (2013) A novel color image segmentation approach based on neutrosophic set and modified fuzzy c-means. Circuits Syst Signal Process 32(4):1699–1723
  27. Kim JY (2014) Segmentation of lip region in color images by fuzzy clustering. Int J Control Autom Syst 12(3):652–661
  28. Ito S, Yoshioka M, Omatu S, Kita K, Kugo K (2006) An image segmentation method using histograms and the human characteristics of HSI color space for a scene image. Artif Life Robot 10(1):6–10
  29. Mignotte M (2010) Penalized maximum rand estimator for image segmentation. IEEE Trans Image Process 19(6):1610–1624
  30. Rashedi E, Nezamabadi-pour H (2013) A stochastic gravitational approach to feature based color image segmentation. Eng Appl Artif Intell 26(4):1322–1332
  31. Mignotte M, Hélou C (2014) A precision–recall criterion based consensus model for fusing multiple segmentations. Int J Signal Process Image Process Pattern Recognit 7(3):61–82
  32. Xue A, Jia C (2009) A new method of color map segmentation based on the self-organizing neural network. In: Emerging intelligent computing technology and applications. With aspects of artificial intelligence, Lecture Notes in Artificial Intelligence (LNAI), vol 5755, pp 417–423
  33. Halder A, Dalmiya S, Sadhu T (2014) Color image segmentation using semi-supervised self-organizing feature map. Adv Signal Process Intell Recognit Syst 264:591–598
  34. Sima H, Guo P, Liu L (2011) Scale estimate of self-organizing map for color image segmentation. In: IEEE Int Conf Syst Man Cybern, pp 1491–1495. doi:10.1109/ICSMC.2011.6083882
  35. Gonzalez RC, Woods RE (2002) Digital image processing, 2nd edn. Prentice Hall, Englewood Cliffs
  36. Kohonen T (1990) The self-organizing map. Proc IEEE 78(9):1464–1480
  37. Zhang H, Fritts JE, Goldman SA (2008) Image segmentation evaluation: a survey of unsupervised methods. Comput Vis Image Underst 110(2):260–280
  38. Estrada FJ, Jepson AD (2009) Benchmarking image segmentation algorithms. Int J Comput Vis 85(2):167–181

Copyright information

© The Natural Computing Applications Forum 2016

Authors and Affiliations

  • Farid García-Lamont (1)
  • Jair Cervantes (1)
  • Asdrúbal López-Chau (2)

  1. Universidad Autónoma del Estado de México, Centro Universitario UAEM Texcoco, Texcoco, Estado de México, Mexico
  2. Universidad Autónoma del Estado de México, Centro Universitario UAEM Zumpango, Zumpango, Estado de México, Mexico
