Abstract
In this paper, a new algorithm based on directional local extrema patterns (DLEP) is proposed for content-based image retrieval. The standard local binary pattern (LBP) encodes the relationship between a reference pixel and its surrounding neighbors by comparing gray-level values. The proposed method differs from LBP in that it extracts directional edge information based on local extrema in the 0\(^{\circ }\), 45\(^{\circ }\), 90\(^{\circ }\), and 135\(^{\circ }\) directions of an image. Its performance is compared with that of LBP, block-based LBP (BLK_LBP), center-symmetric local binary pattern (CS-LBP), local edge patterns for segmentation (LEPSEG), local edge patterns for image retrieval (LEPINV), and other existing transform-domain methods by conducting four experiments on the benchmark Corel (DB1) and Brodatz (DB2) databases. The results show a significant improvement in the evaluation measures over the other existing methods on the respective databases.
1 Introduction
Good-quality digital scanners and cameras are now widely available, and they produce a huge number of images. However, these data cannot be accessed or used unless they are organized so as to allow efficient browsing, searching, and retrieval. Hence, there is a pressing need for search techniques such as content-based image retrieval (CBIR). Feature extraction is a prominent step, and the effectiveness of a CBIR system depends largely on the method used to extract features from raw images. CBIR utilizes the visual contents of an image such as color, texture, shape, faces, and spatial layout to represent and index the image. The visual features can further be classified into general features, which include color, texture, and shape, and domain-specific features such as human faces and fingerprints. No single representation of an image is best for all perceptual subjectivity, because the user may take photographs under different conditions (view angle, illumination changes, etc.). Comprehensive and extensive literature surveys on CBIR are presented in [1, 2, 3, 4].
Texture analysis has attracted a great deal of attention owing to its potential value for computer vision and pattern recognition applications. It is particularly well suited to the identification of products such as ceramic tiles, marble, and parquet slabs, and has drawn considerable interest from researchers. Ahmadian et al. used the wavelet transform for texture classification [5]. Discrete wavelet transform (DWT) based texture image retrieval using generalized Gaussian density and Kullback–Leibler distance is presented in [6]. Texture classification and segmentation using wavelet frames is reported in [7]. However, DWT can extract features in only three directions (horizontal, vertical, and diagonal). Hence, the Gabor transform (GT) [8], rotated wavelet filters [9], and the combination of dual tree complex wavelet filters (DT-CWF) and dual tree rotated complex wavelet filters (DT-RCWF) [10] have been proposed in the literature to extract the additional directional features that are absent in DWT. Manjunath et al. [8] proposed the GT for image retrieval on the Brodatz texture database, using the mean and standard deviation features from four scales and six directions of GT. Kokare et al. proposed texture image retrieval based on image characteristics computed in different directions using rotated wavelet filters [9], the combination of DT-CWF and DT-RCWF [10], and rotation-invariant DT-RCWF [11].
A concise review of the literature most relevant to the development of our algorithm is given here. The local binary pattern (LBP) features are designed for the description of texture. Ojala et al. [12] proposed the LBP, which was later extended to a rotation-invariant form for texture classification [13]. Pietikainen et al. proposed rotation-invariant texture classification using feature distributions [14]. Ahonen et al. [15] and Zhao and Pietikainen [16] used the LBP operator for facial expression analysis and recognition. Heikkila et al. proposed background modeling and detection using LBP [17]. Huang et al. [18] proposed the extended LBP for shape localization. Heikkila et al. [19] used the LBP for interest region description. Li and Staunton [20] used the combination of Gabor filter and LBP for texture segmentation. Zhang et al. [21] proposed local derivative patterns (LDP) for face recognition, where they considered LBP as non-directional first-order local patterns collected from the first-order derivatives of an image. A block-based texture feature that uses the LBP texture feature as the source of image description is proposed in [22] for CBIR. The center-symmetric local binary pattern (CS-LBP), a modified version of the well-known LBP feature, is combined with the scale invariant feature transform (SIFT) in [23] for the description of interest regions. Yao et al. [24] have proposed two types of local edge pattern (LEP) histograms: LEPSEG for image segmentation and LEPINV for image retrieval. The LEPSEG is sensitive to variations in rotation and scale, whereas the LEPINV is resistant to them.
It has already been shown that directional features are very valuable for image retrieval applications [9, 10, 11]. However, the extensions of LBP discussed above are non-directional features. To address this problem, in this paper we propose the directional local extrema patterns (DLEP) for image retrieval. The main contributions of this work are given in the next subsection.
1.1 Main contributions
The main contributions of this work are summarized as follows:
1. The DLEP is proposed in the 0\(^{\circ }\), 45\(^{\circ }\), 90\(^{\circ }\), and 135\(^{\circ }\) directions, in contrast to LBP. The DLEP differs from the existing LBP in that it extracts directional edge information based on local extrema.
2. The performance of the proposed method is tested on benchmark image databases.
The paper is organized as follows: In Sect. 1, a brief review of CBIR and related work is given. Section 2 presents a concise review of local pattern operators. The proposed system framework and query matching are illustrated in Sect. 3. Experimental results and discussions are given in Sect. 4. Conclusions and future scope are presented in Sect. 5.
2 Local patterns
2.1 Local binary patterns (LBP)
The LBP operator was introduced by Ojala et al. [12] for texture classification. Owing to its speed (no parameters need to be tuned) and performance, it has been applied successfully in many research areas such as texture classification [12, 13, 14], face recognition [15, 16], object tracking, bio-medical image retrieval, and fingerprint recognition. Given a center pixel in the 3\(\times \)3 pattern, the LBP value is computed by comparing its gray-scale value with those of its neighbors based on Eqs. (1) and (2):
where \(I(g_c )\) denotes the gray value of the center pixel, \(I(g_p)\) represents the gray value of its neighbors, \(P\) stands for the number of neighbors, and \(R\) the radius of the neighborhood.
After computing the LBP pattern for each pixel \((j, k)\), the whole image is represented by building a histogram as shown in Eq. (3).
where the size of input image is \(N_1 \times N_2 \).
Figure 1 shows an example of obtaining an LBP from a given \(3\times 3\) pattern. The histograms of these patterns contain the information on the distribution of edges in an image.
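The LBP computation described above can be sketched in Python as follows. This is a minimal illustration, assuming the standard formulation of [12], \(\mathrm{LBP}_{P,R}=\sum_{p=0}^{P-1}2^{p}\,f(I(g_p)-I(g_c))\) with \(f(x)=1\) for \(x\ge 0\) and \(f(x)=0\) otherwise, for \(P=8\) and \(R=1\). The neighbor ordering, the function names, and the choice to skip border pixels are assumptions made for this sketch and are not taken from the paper.

```python
import numpy as np

# (row, col) offsets of the 8 neighbors at radius 1. The ordering (start at
# the right-hand neighbor, go counter-clockwise) is an assumed convention.
NEIGHBOR_OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
                    (0, -1), (1, -1), (1, 0), (1, 1)]


def lbp_code(patch):
    """Return the LBP code (0..255) of the center pixel of a 3x3 patch."""
    patch = np.asarray(patch, dtype=np.int64)
    center = patch[1, 1]
    code = 0
    for p, (dr, dc) in enumerate(NEIGHBOR_OFFSETS):
        # f(I(g_p) - I(g_c)) = 1 when the neighbor is >= the center.
        if patch[1 + dr, 1 + dc] >= center:
            code += 1 << p
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels.

    Border pixels are skipped (an assumption; the paper does not say how
    image borders are handled).
    """
    image = np.asarray(image, dtype=np.int64)
    hist = np.zeros(256, dtype=np.int64)
    rows, cols = image.shape
    for j in range(1, rows - 1):
        for k in range(1, cols - 1):
            hist[lbp_code(image[j - 1:j + 2, k - 1:k + 2])] += 1
    return hist
```

With this ordering, a patch whose neighbors all equal the center yields the code 255, and a patch whose center exceeds every neighbor yields 0.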
2.2 Center-symmetric local binary patterns (CS_LBP)
Instead of comparing each pixel with the center pixel, Heikkila et al. [23] have compared center-symmetric pairs of pixels for CS_LBP as shown in Eq. (5):
After computing the CS_LBP pattern for each pixel \((j, k)\), the whole image is represented by building a histogram, as for the LBP.
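A minimal Python sketch of the center-symmetric comparison is given below. It assumes the form \(\mathrm{CS\_LBP}_{P,R}=\sum_{p=0}^{(P/2)-1}2^{p}\,f(I(g_p)-I(g_{p+P/2}))\) with the same binary function \(f\) as in LBP and \(P=8\), \(R=1\). The original CS-LBP formulation in [23] additionally uses a small threshold in the comparison, which is omitted here. The neighbor ordering and the function name are illustrative assumptions.

```python
import numpy as np

# Assumed neighbor ordering: start at the right-hand neighbor and go
# counter-clockwise, so that index p and index p + 4 are opposite pixels.
NEIGHBOR_OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
                    (0, -1), (1, -1), (1, 0), (1, 1)]


def cs_lbp_code(patch):
    """Return the 4-bit CS-LBP code (0..15) of a 3x3 patch."""
    patch = np.asarray(patch, dtype=np.int64)
    half = len(NEIGHBOR_OFFSETS) // 2
    code = 0
    for p in range(half):
        dr, dc = NEIGHBOR_OFFSETS[p]
        odr, odc = NEIGHBOR_OFFSETS[p + half]
        # Compare each neighbor with its center-symmetric counterpart;
        # the center pixel itself is not used.
        if patch[1 + dr, 1 + dc] >= patch[1 + odr, 1 + odc]:
            code += 1 << p
    return code
```

Because only four pairs are compared, the resulting histogram has 16 bins instead of the 256 bins of the 8-neighbor LBP.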
2.3 Directional local extrema patterns (DLEP)
The idea of LBP proposed in [12] has been adopted to define directional local extrema patterns (DLEP). DLEP describes the spatial structure of the local texture using the local extrema of the center pixel \(g_c\).
In the proposed DLEP, the local extrema of a given image in the 0\(^{\circ }\), 45\(^{\circ }\), 90\(^{\circ }\), and 135\(^{\circ }\) directions are obtained by computing the local differences between the center pixel and its neighbors as shown below:
The local extrema are obtained by Eq. (7).
The DLEP is defined (\(\alpha =0^{\circ }, 45^{\circ }, 90^{\circ }\), and \(135^{\circ })\) as follows:
The detailed representation of DLEP can be seen in Fig. 2.
Eventually, the given image is converted to DLEP images with values ranging from 0 to 511.
After calculation of DLEP, the whole image is represented by building a histogram using Eq. (10).
where the size of input image is \(N_1 \times N_2 \).
The DLEP computation for a center pixel marked in red is illustrated in Fig. 2. The local differences between the center pixel and its eight neighbors are used to evaluate the directions as shown in Fig. 2. These directions are then used to obtain DLEP patterns in the 0\(^{\circ }\), 45\(^{\circ }\), 90\(^{\circ }\), and 135\(^{\circ }\) directions. The selected 3\(\times \)3 pattern for DLEP calculation is represented with subscripts (0) to (8) as shown in Fig. 2.
An example of the DLEP computation in the 0\(^{\circ }\) direction for a center pixel marked in red is illustrated in Fig. 3. For the center pixel ‘6’ we evaluate the local extremum in the horizontal direction; both directions point away from the center pixel, and hence this bit is coded as ‘1’. Similarly, we compute the remaining bits of the DLEP from the 8 neighbors, and the resulting pattern is ‘1 1 1 0 1 1 1 0 1’. In the same fashion, DLEP patterns for the center pixel in the 45\(^{\circ }\), 90\(^{\circ }\), and 135\(^{\circ }\) directions are also computed.
The proposed DLEP is different from the well-known LBP. The DLEP encodes the spatial relation between any pair of neighbors in a local region along a given direction, while LBP [12] extracts the relation between the center pixel and its neighbors. Therefore, DLEP captures more spatial information than LBP. It has already been shown that directional features are very valuable for image retrieval applications [9, 10, 11].
Figure 4 illustrates the results obtained by applying LBP and DLEP operators on referenced face image. Face image is chosen as it provides the results which are visibly comprehensible to differentiate the effectiveness of these approaches. From Fig. 4, it is observed that the DLEP yields more directional edge information as compared with LBP. The experimental results demonstrate that the proposed DLEP shows better performance as compared with LBP, indicating that it can capture more edge information than LBP for texture extraction.
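The DLEP computation described in this section can be sketched in Python as follows. This is only an illustration under stated assumptions: a pixel is treated as a local extremum along direction \(\alpha \) when its differences with the two opposite neighbors along \(\alpha \) have the same sign (their product is non-negative), and the 9-bit pattern collects this extremum bit for the center pixel and its eight neighbors, which gives values from 0 to 511. The offset tables, the bit ordering, the edge-replicating border handling, and the function names are choices made for this sketch, not details confirmed by the paper.

```python
import numpy as np

# Assumed (row, col) offsets of the two opposite neighbors for each
# direction, with rows increasing downward.
DIRECTION_OFFSETS = {
    0: ((0, 1), (0, -1)),
    45: ((-1, 1), (1, -1)),
    90: ((-1, 0), (1, 0)),
    135: ((-1, -1), (1, 1)),
}

# Assumed bit order of the 9-bit pattern: center first, then 8 neighbors.
PATTERN_OFFSETS = [(0, 0), (0, 1), (-1, 1), (-1, 0), (-1, -1),
                   (0, -1), (1, -1), (1, 0), (1, 1)]


def extrema_map(image, alpha):
    """Return 1 where a pixel is a local extremum along direction alpha."""
    img = np.asarray(image, dtype=np.int64)
    rows, cols = img.shape
    # Replicate border pixels (assumption: border handling is not specified).
    padded = np.pad(img, 1, mode="edge")
    (r1, c1), (r2, c2) = DIRECTION_OFFSETS[alpha]
    first = padded[1 + r1:1 + r1 + rows, 1 + c1:1 + c1 + cols]
    second = padded[1 + r2:1 + r2 + rows, 1 + c2:1 + c2 + cols]
    # Local differences between the center and its two neighbors along alpha.
    diff_first = img - first
    diff_second = img - second
    # Same sign (or zero) on both sides means a local maximum or minimum.
    return (diff_first * diff_second >= 0).astype(np.int64)


def dlep_map(image, alpha):
    """Return the 9-bit DLEP code (0..511) of every pixel for direction alpha."""
    ext = extrema_map(image, alpha)
    rows, cols = ext.shape
    padded = np.pad(ext, 1, mode="edge")
    codes = np.zeros((rows, cols), dtype=np.int64)
    for bit, (dr, dc) in enumerate(PATTERN_OFFSETS):
        codes += padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols] << bit
    return codes
```

Calling `dlep_map(image, alpha)` for each of the four directions gives four code maps, one per direction, each of which can then be summarized by a 512-bin histogram.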
3 Proposed system framework
3.1 Proposed image retrieval system
Figure 5 depicts the flowchart of the proposed technique, and its algorithm is presented here:
Algorithm:
Input: Image; Output: Retrieval result
1. Load the gray-scale image.
2. Calculate the local extrema in the 0\(^{\circ }\), 45\(^{\circ }\), 90\(^{\circ }\), and 135\(^{\circ }\) directions.
3. Compute the DLEP patterns in the 0\(^{\circ }\), 45\(^{\circ }\), 90\(^{\circ }\), and 135\(^{\circ }\) directions.
4. Construct the histograms for the DLEP patterns in the 0\(^{\circ }\), 45\(^{\circ }\), 90\(^{\circ }\), and 135\(^{\circ }\) directions.
5. Construct the feature vector by concatenating all histograms.
6. Compare the query image with the images in the database using Eq. (11).
7. Retrieve the images based on the best matches.
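Steps 4 and 5 of the algorithm can be sketched in Python as follows. The sketch assumes that the DLEP code maps for the four directions have already been computed as integer arrays with values in the range 0 to 511, so that each direction contributes a 512-bin histogram and the concatenated feature vector has 2048 entries. The function name is hypothetical, and no histogram normalization is applied because the paper does not mention one.

```python
import numpy as np

NUM_BINS = 512  # 9-bit DLEP codes take values 0..511


def dlep_feature_vector(code_maps):
    """Concatenate the per-direction histograms of DLEP code maps.

    code_maps: sequence of integer arrays, one per direction, each holding
    DLEP codes in the range 0..511.
    """
    histograms = []
    for codes in code_maps:
        flat = np.asarray(codes, dtype=np.int64).ravel()
        if flat.size and (flat.min() < 0 or flat.max() >= NUM_BINS):
            raise ValueError("DLEP codes must lie in 0..511")
        # Count how often each code occurs in this direction's map.
        histograms.append(np.bincount(flat, minlength=NUM_BINS))
    return np.concatenate(histograms)
```

Keeping the four histograms side by side, rather than summing them, preserves the information about which direction each pattern was observed in.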
3.2 Query matching
The feature vector for the query image \(Q\), obtained after feature extraction, is represented as \(f_Q =(f_{Q_1 } ,f_{Q_2 } ,\ldots ,f_{Q_{Lg} } )\). Similarly, each image in the database is represented with the feature vector \(f_{\mathrm{ DB}_j } =(f_{\mathrm{ DB}_{j1} } ,f_{\mathrm{ DB}_{j2} } ,\ldots ,f_{\mathrm{ DB}_{jLg} } );\,j=1,2, \ldots ,\left|\text{DB} \right|\). The goal is to select the \(n\) best images that resemble the query image. This involves selecting the \(n\) top matched images by measuring the distance between the query image and each image in the database \(\left|\text{DB} \right|\). In order to match the images we used the \(d_{1}\) similarity distance metric computed by Eq. (11).
where \(f_{\mathrm{ DB}_{ji} } \) is the \(i\)th feature of the \(j\)th image in the database \(\left|{\text{ DB}} \right|\).
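Query matching can be sketched in Python as follows. The sketch assumes that the \(d_1\) distance takes the form \(D(Q,\mathrm{DB}_j)=\sum_{i=1}^{Lg}\left|\frac{f_{\mathrm{DB}_{ji}}-f_{Q_i}}{1+f_{\mathrm{DB}_{ji}}+f_{Q_i}}\right|\), which is a commonly used definition of this measure for histogram features; this form is an assumption and should be checked against Eq. (11). The function names are hypothetical.

```python
import numpy as np


def d1_distance(f_query, f_db):
    """Assumed d1 distance between two non-negative feature vectors."""
    f_query = np.asarray(f_query, dtype=np.float64)
    f_db = np.asarray(f_db, dtype=np.float64)
    # Histogram features are non-negative, so the denominator is >= 1.
    return float(np.sum(np.abs((f_db - f_query) / (1.0 + f_db + f_query))))


def top_matches(f_query, db_features, n):
    """Return the indices of the n database images closest to the query."""
    distances = np.array([d1_distance(f_query, f) for f in db_features])
    # A stable sort keeps database order among equally distant images.
    order = np.argsort(distances, kind="stable")
    return order[:n]
```

When the query image itself is part of the database, it is returned first with distance zero.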
4 Experimental results and discussions
The retrieval performance of the proposed method has been analyzed by conducting four experiments on two different databases (Corel-10K (DB1), and Brodatz database (DB2)) and results are presented separately.
In experiments #1, #2, and #3, images from the Corel database [25] have been used. This database consists of a large number of images of various contents ranging from animals to outdoor sports to natural images. These images have been pre-classified by domain professionals into different categories, each of size 100. Some researchers consider that the Corel database meets all the requirements for evaluating an image retrieval system, owing to its large size and heterogeneous content.
In all experiments, each image in the database is used as the query image. For each query, the system collects \(n\) database images \(X=(x_{1}, x_{2}, \ldots , x_{n})\) with the shortest image matching distance computed using Eq. (11). If a retrieved image \(x_{i}\), \(i=1, 2, \ldots , n\), belongs to the same category as the query image, then we say the system has correctly identified the expected image; otherwise, the system has failed to find the expected image.
The performance of the proposed method is measured in terms of average precision, average recall, and average retrieval rate (ARR) as shown below:
For the query image \(I_q \), the precision is defined as follows:
where \(n\) indicates the number of retrieved images and \(\left|{\text{DB}} \right|\) is the size of the image database. \(\Phi (x)\) stands for the category of \(x\), \(\text{Rank}(I_i ,I_q )\) returns the rank of image \(I_i \) (for the query image \(I_q \)) among all images of \(\left| {\text{DB}} \right|\), and \(\delta (\Phi (I_i ),\Phi (I_q ))=\left\{ \begin{array}{ll} 1 & \Phi (I_i )=\Phi (I_q ) \\ 0 & \text{otherwise} \end{array} \right.\).
Recall is defined as below:
The average precision for the \(j\)th similarity category of the reference image database is given by Eq. (14).
Finally, the total average precision and ARR for the whole reference image database are computed using Eqs. (15) and (16), respectively.
The average recall \((R)\) is also defined in the same manner.
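The per-query precision and recall can be sketched in Python as follows. The sketch assumes the usual definitions that match the symbols above: precision is the number of retrieved images with \(\delta =1\) divided by \(n\), and recall is the same count divided by the number of images in the query's category (\(N_G\)). The function names are hypothetical, and the averaging helper simply takes an arithmetic mean over queries.

```python
def precision_recall(query_category, retrieved_categories, category_size):
    """Return (precision, recall) for one query.

    retrieved_categories: categories of the n top-ranked images.
    category_size: number of database images in the query's category (N_G).
    """
    n = len(retrieved_categories)
    if n == 0 or category_size <= 0:
        raise ValueError("need at least one retrieved image and N_G > 0")
    # delta(...) = 1 when a retrieved image shares the query's category.
    relevant = sum(1 for c in retrieved_categories if c == query_category)
    return relevant / n, relevant / category_size


def average(values):
    """Arithmetic mean, e.g. of per-query precision within one category."""
    values = list(values)
    return sum(values) / len(values)
```

For example, if three of the four top matches for a "buses" query are buses and the category holds 100 images, the precision is 0.75 and the recall is 0.03.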
4.1 Experiment #1
For this experiment, we have collected 1000 images to form database Corel-1K. These images are collected from ten different domains, namely Africans, beaches, buildings, buses, dinosaurs, elephants, flowers, horses, mountains, and food. Each category has \(N_{G}\) (100) images with resolution of either \(256\times 384\) or \(384\times 256\). Figure 6 shows the sample images of Corel-1K database (one image from each category). The performance of the proposed method is measured in terms of average precision, average recall, and ARR as shown in Eqs. (12–16).
Tables 1 and 2 show the results of proposed method and other existing methods (LBP, CS_LBP, LEPSEG, LEPINV, BLK_LBP) in terms of precision and recall. The results are considered to be better if average values of precision and recall are high.
From Tables 1 and 2, the following points are observed:
1. The average precision of the proposed method (74.8%) is higher than that of LBP (71.2%), CS_LBP (59.1%), LEPSEG (65.2%), LEPINV (60.8%), and BLK_LBP (70.1%).
2. The average recall of the proposed method (49.16%) is higher than that of LBP (45.71%), CS_LBP (40.9%), LEPSEG (38.1%), LEPINV (34.68%), and BLK_LBP (43.0%).
From the above observations, it is evident that the proposed method significantly improves results in terms of average precision and average recall. Figure 7a, b show the experimental results of proposed method and other existing methods. It is observed that the proposed method (DLEP) achieves a superior average precision and ARR on image database Corel-1K as compared with other existing methods.
4.2 Experiment #2
In this experiment, we have used 5000 images to form database of Corel-5K. This database consists of 50 different categories and each category contains 100 images. The performance of the proposed method is measured in terms of average precision, average recall, and ARR as shown in Eqs. (12–16).
Table 3 illustrates the retrieval results of proposed method and other existing methods on Corel-5K and Corel-10K databases in terms of average precision and recall. Figure 8a, b show the category-wise performance of methods in terms of precision and recall on Corel-5K database. The performance of all techniques in terms of average precision and ARR on Corel-5K database can be seen in Fig. 8c, d, respectively. From Table 3 and Fig. 8, it is clear that the proposed method shows a significant improvement as compared with other existing methods in terms of their evaluation measures on Corel-5K database. Figure 9 illustrates the query results of proposed method on Corel-5K database (top left image is the query image).
4.3 Experiment #3
In experiment #3, we have used 10,000 images to form database of Corel-10K. This database consists of 100 different categories and each category contains 100 images. The performance of the proposed method is measured in terms of average precision, average recall, and ARR as shown in Eqs. (12–16).
Figure 10a, b show the category-wise performance of methods in terms of precision and recall on Corel-10K database. The performance of all techniques in terms of average precision and ARR on Corel-10K database can be seen in Fig. 10c, d, respectively. From Table 3 and Fig. 10, it is clear that the proposed method shows a significant improvement as compared with other existing methods in terms of their evaluation measures on Corel-10K database. Figure 11 illustrates the query results of proposed method on Corel-10K database (top left image is the query image).
4.4 Experiment #4
In experiment #4, the database DB2, which consists of 116 different textures, is used. We have used 109 textures from the Brodatz texture photographic album [26] and seven textures from the University of Southern California (USC) database [27]. The size of each texture is 512\(\times \)512. Each 512\(\times \)512 image is divided into sixteen 128\(\times \)128 non-overlapping sub-images, thus creating a database of 1856 (116\(\times \)16) images. In this experiment, each image in the database is considered as the query image and the performance of the proposed method is measured in terms of ARR as given by Eq. (17).
The database DB2 is used to compare the performance of the proposed method (DLEP) with other existing methods (GT, DT-CWT, DT-RCWT, DT-CWT+DT-RCWT, CS_LBP, LEPSEG, LEPINV, BLK_LBP, and LBP) in terms of ARR. From Table 4, it is evident that the proposed method outperforms the other existing methods. Figure 12a, b show graphs that illustrate the retrieval performance of the proposed method and other existing methods as a function of the number of top matches; the proposed method outperforms the other existing methods in terms of ARR.
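The construction of the DB2 sub-images can be sketched in Python as follows, assuming each texture is available as a 2-D array. The function name and the row-major tile order are choices made for this illustration.

```python
import numpy as np


def split_into_tiles(texture, tile=128):
    """Split a texture into non-overlapping tile x tile sub-images."""
    texture = np.asarray(texture)
    rows, cols = texture.shape[:2]
    if rows % tile or cols % tile:
        raise ValueError("texture size must be a multiple of the tile size")
    # Walk the image in row-major order, one tile at a time.
    return [texture[r:r + tile, c:c + tile]
            for r in range(0, rows, tile)
            for c in range(0, cols, tile)]
```

A 512\(\times \)512 texture yields 16 sub-images, so 116 textures give the 1856 database images used in this experiment.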
5 Conclusions and future work
A new approach for CBIR is presented in this paper. The proposed DLEP differs from the existing LBP in a manner that it extracts the directional edge information based on local extrema in 0\(^{\circ }\), 45\(^{\circ }\), 90\(^{\circ }\), and 135\(^{\circ }\) directions in an image. Performance of the proposed method is tested by conducting four experiments on benchmark image databases and retrieval results show a significant improvement in terms of their evaluation measures as compared with other existing methods on respective databases.
Further, this work can be extended by combining the proposed method with GT and by varying the number of neighbors (more directions) of the referenced pixels.
References
Rui Y, Huang TS (1999) Image retrieval: current techniques, promising directions and open issues. J Vis Commun Image Represent 10:39–62
Smeulders AWM, Worring M, Santini S, Gupta A, Jain R (2000) Content-based image retrieval at the end of the early years. IEEE Trans Pattern Anal Mach Intell 22(12):1349–1380
Kokare M, Chatterji BN, Biswas PK (2002) A survey on current content based image retrieval methods. IETE J Res 48(3&4):261–271
Liu Y, Zhang D, Lu G, Ma W-Y (2007) A survey of content-based image retrieval with high-level semantics. J Pattern Recognit 40:262–282
Ahmadian A, Mostafa A (2003) An efficient texture classification algorithm using Gabor wavelet. In: 25th annual international conference of the IEEE EMBS, pp 930–933, Cancun, Mexico
Do MN, Vetterli M (2002) Wavelet-based texture retrieval using generalized Gaussian density and Kullback-leibler distance. IEEE Trans Image Process 11(2):146–158
Unser M (1993) Texture classification by wavelet packet signatures. IEEE Trans Pattern Anal Mach Intell 15(11):1186–1191
Manjunath BS, Ma WY (1996) Texture features for browsing and retrieval of image data. IEEE Trans Pattern Anal Mach Intell 18(8):837–842
Kokare M, Biswas PK, Chatterji BN (2007) Texture image retrieval using rotated wavelet filters. J Pattern Recognit Lett 28:1240–1249
Kokare M, Biswas PK, Chatterji BN (2005) Texture image retrieval using new rotated complex wavelet filters. IEEE Trans Syst Man Cybernet 33(6):1168–1178
Kokare M, Biswas PK, Chatterji BN (2006) Rotation-invariant texture image retrieval using rotated complex wavelet filters. IEEE Trans Syst Man Cybernet 36(6):1273–1282
Ojala T, Pietikainen M, Harwood D (1996) A comparative study of texture measures with classification based on feature distributions. J Pattern Recognit 29(1):51–59
Ojala T, Pietikainen M, Maenpaa T (2002) Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans Pattern Anal Mach Intell 24(7):971–987
Pietikainen M, Ojala T, Scruggs T, Bowyer KW, Jin C, Hoffman K, Marques J, Jacsik M, Worek W (2000) Overview of the face recognition using feature distributions. J Pattern Recognit 33(1):43–52
Ahonen T, Hadid A, Pietikainen M (2006) Face description with local binary patterns: applications to face recognition. IEEE Trans Pattern Anal Mach Intell 28(12):2037–2041
Zhao G, Pietikainen M (2007) Dynamic texture recognition using local binary patterns with an application to facial expressions. IEEE Trans Pattern Anal Mach Intell 29(6):915–928
Heikkila M, Pietikainen M (2006) A texture based method for modeling the background and detecting moving objects. IEEE Trans Pattern Anal Mach Intell 28(4):657–662
Huang X, Li SZ, Wang Y (2004) Shape localization based on statistical method using extended local binary patterns. In: Proc Int Conf Image and Graphics, pp 184–187
Heikkila M, Pietikainen M, Schmid C (2009) Description of interest regions with local binary patterns. J Pattern Recognit 42:425–436
Li M, Staunton RC (2008) Optimum Gabor filter design and local binary patterns for texture segmentation. J Pattern Recognit 29:664–672
Zhang B, Gao Y, Zhao S, Liu J (2010) Local derivative pattern versus local binary pattern: Face recognition with higher-order local pattern descriptor. IEEE Trans Image Process 19(2):533–544
Takala V, Ahonen T, Pietikainen M (2005) Block-based methods for image retrieval using local binary patterns. SCIA 2005, LNCS, vol 3450, pp 882–891
Heikkila M, Pietikainen M, Schmid C (2009) Description of interest regions with local binary patterns. Pattern Recognit 42:425–436
Yao C-H, Chen S-Y (2003) Retrieval of translated, rotated and scaled color textures. Pattern Recognit 36:913–929
Corel-1K image database. [Online]. http://wang.ist.psu.edu/docs/related.shtml
Brodatz P (1966) Textures: a photographic album for artists and designers. Dover, New York
University of Southern California, Signal and Image Processing Institute, Rotated Textures. [Online]. http://sipi.usc.edu/database/
Acknowledgments
This work was supported by the Ministry of Human Resource and Development, India under grant MHR-02-23-200 (429). Our sincere thanks to Mr. Anil Balaji Gonde, Dr. Dinesh Kumar Rajoriya and Mr. Pratul Arvind (Research Scholars), Department of Electrical Engineering, Indian Institute of Technology Roorkee, Roorkee for their valuable technical discussions during this work. We would like to thank the editor and anonymous reviewers for insightful comments and helpful suggestions to improve the quality, which have been incorporated in this manuscript.
Murala, S., Maheshwari, R.P. & Balasubramanian, R. Directional local extrema patterns: a new descriptor for content based image retrieval. Int J Multimed Info Retr 1, 191–203 (2012). https://doi.org/10.1007/s13735-012-0008-2
DOI: https://doi.org/10.1007/s13735-012-0008-2