Granulation-Based Reverse Image Retrieval for Microscopic Rock Images

Habrat, Magdalena; Młynarczuk, Mariusz

doi:10.1007/978-3-030-50420-5_6

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12139)

Included in the following conference series:

International Conference on Computational Science

Abstract

The paper presents a method of object detection in microscopic images of rocks, which makes it possible to identify images with similar structural features of the rock. These features are understood as the sizes and shapes of its components and the mutual relationships between them. The proposed detection methodology is adaptive and unsupervised, and analyzes characteristic color clusters in the image. It achieves good detection results for rocks with clear clusters of colored objects. For the analyzed data set, the method retrieves sets of rock images with high visual similarity, which translates into a geological classification accuracy above 78%. Considering that the proposed method is based on segmentation that does not require any input parameters, this result should be considered satisfactory. In the authors’ opinion, the method can be used for rock image search and sorting or, for example, for automatic selection of optimal segmentation techniques.

1 Introduction

Recent years have seen dynamic technological progress in the field of computer processing and recognition of images. The methods developed continue to find new applications, including in the earth sciences. The active development of IT methods, which began in the 1990s, has led to a situation where many measurements in geology are based on digital images and image sequences obtained from, for example, optical, electron and confocal microscopes [1,2,3,4,5]. Carrying out automatic measurements on images means that researchers have access to constantly growing image databases. This necessitates their automatic interpretation, which in turn requires their automatic indexing. Currently in geology, such data sets are managed manually using specialized knowledge. However, for very large collections, which often contain hundreds of thousands of images, this approach is difficult to implement because of the huge amount of time it requires. This situation calls for methods that allow automatic management of image data sets [6, 7]. These include, among others, image search techniques, which have been intensively developed in recent years [8,9,10,11,12]. Such methods are also the subject of the research described in this work. The paper presents research on an unsupervised retrieval system, where the query takes the form of an image (or a fragment of one) and no additional conditions are explicitly defined. This approach is known as image search, and is also referred to as reverse image search [13].

1.1 The Idea of Image Retrieval Systems

Image retrieval systems belong to the group of techniques that support the automatic management of image data sets [6,7,8, 10, 11, 14, 15]. Two main approaches to image search have become particularly popular in the literature.

The first (which is not the subject of this work) involves finding images based on so-called metadata, which can be technical parameters of the image as well as verbal descriptions, e.g. of the image content. This type of image search system is often referred to as TBIR (Text Based Image Retrieval).

The other mainstream approach to image search is based on image content analysis and is often referred to as CBIR or CBVIR (Content Based (Visual) Image Retrieval) [16, 17]. This approach largely reflects the way in which images are compared by the human mind, referring to the content of images that differ in color, texture and content. Such an image can be described by certain features. In order to detect these features, it is necessary to determine optimal and, if possible, universal methods for obtaining descriptors (understood as numerical representations of image features). In the search task, both the search query and the set of searched images are processed to extract feature vectors. These features are then compared and matched according to established criteria. In a nutshell, one can distinguish two main trends in the construction of image search systems, which differ in the way they determine the similarity between the data. The first is so-called unsupervised search, also referred to as search “without interaction”, based on unsupervised machine learning methods [18,19,20]. The other is so-called supervised search, also referred to as search “with interaction” or feedback, based on supervised machine learning methods (relevance feedback) [21,22,23]. Regardless of the similarity determination methodology selected, in the reverse image search problem it is important that the feature detection process be as effective as possible.

1.2 Initial Description of the Presented Method

Due to the cognitive nature of geological research and the constantly growing resources of digital archives of rock images, the application of methods supporting the work of experts seems fully justified. The prerequisites for a system that may be created based on the methodology proposed in this work are as follows:

the user has no knowledge of the searched image archive;
the data in the archive are not described; they only have IDs assigned during the operation of the algorithm;
the user can indicate the search key in the form of an image (or fragment), and the search is based only on the analysis of the features of the selected query and the available archive.

The adoption of such assumptions means that the reverse search method based on the analysis of the shapes of objects recorded in the pictures should rely on universal steps that yield acceptable effectiveness for images of various rocks. On this basis, the main stages of the method can be distinguished:

1. feature detection stage, i.e. automatic processing of the query key and images from the database to extract the characteristic features, including:
  a. extraction of values describing selected features both for the query image and for the entire available image database - this stage is described in detail in this work;
  b. creating a common feature vector that includes the feature values of the query and of the image database; its normalization is also carried out as part of this work;
2. the stage of determining similarity between the values of the described features in order to determine the search results.
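
The two stages can be illustrated with a minimal sketch. It assumes min-max normalization and Euclidean distance, neither of which is specified in the text; the function names and the two-feature descriptors are hypothetical.

```python
import math

def normalize_columns(rows):
    """Min-max scale each feature (column) to [0, 1] over all rows."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def rank_by_similarity(query, database):
    """Return database indices ordered from most to least similar to the query."""
    # Stage 1b: one common feature matrix (query first), normalized jointly.
    scaled = normalize_columns([query] + database)
    q, rest = scaled[0], scaled[1:]
    # Stage 2 (assumed metric): Euclidean distance in the normalized space.
    distances = [math.dist(q, row) for row in rest]
    return sorted(range(len(rest)), key=distances.__getitem__)

# Hypothetical two-feature descriptors (e.g. object count, mean object area).
database = [[120.0, 35.0], [15.0, 900.0], [110.0, 40.0]]
query = [118.0, 37.0]
ranking = rank_by_similarity(query, database)
```

Normalizing the query together with the database keeps features with large numeric ranges (such as areas) from dominating the distance.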

In this paper, the authors present a stepwise description of the unsupervised reverse search methodology, placing particular emphasis on the description and results of the object detection method, which indicates similar features of rock granulation.

2 Method of Conglomerates Detection

In order to quantify the groups of visually similar grains (called here conglomerates), it is necessary to segment the rock image. This research proposes a detection method consisting of four main steps:

1. selection of data for analysis;
2. image generalization, that is, determining the number of dominant colors and then dividing the image into areas coherent in terms of color;
3. estimation of the number of color clusters of the regions detected in step 2;
4. segmentation of the input image (from step 1) by color clustering, according to the number of clusters determined in step 3; final image preparation.

2.1 Selection of Data for Analysis

The research used images of 8 different groups of rocks. The rocks were selected so that they differ in grain size (i.e. granulation). The study analyzed three groups of grain sizes: fine-grained (dolomite and quartzite), medium-grained (crystalline slate, metamorphic shale and limestone) and coarse-grained (anhydrite, granite and granodiorite); see Fig. 1.

The images were recorded using an optical microscope with polarized light, with optimal lighting and 100x magnification, which remained unchanged throughout the recording of all photos. The image database on which the analyses were carried out comprised 800 digital images, i.e. 100 images for each rock.

The method described in this work was developed and tested in the RGB color space, where each channel is a grayscale image with gray levels in the range from 0 to 255. In the first step, it is recommended to set a temporary parameter for the maximum number of colors taken into account during further analysis. This can be a division of the entire color space into 256 clusters, corresponding to the number of gray levels of the image. Choosing larger values gives more detailed results but increases the running time of the algorithm.

2.2 Determining the Areas of the Image Dominating in Terms of Color

The next stage of the method is the initial generalization of the image by dividing it into fragments with possibly consistent color representation. It can be done through:

manual selection - a very large fixed number of areas can be assumed, which can be generated by a growth segmentation method or, for example, by superpixel detection [24];
adaptive selection - by calculating the number of actual dominant colors in the image. For this purpose, the unique colors that occur most often in the image are counted. Additionally, the minimum number of occurrences of each detected color is checked. The more dominant colors there are, the smaller their area of occurrence. For images with different resolutions and sizes, normalization should be carried out.
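
The adaptive selection can be sketched roughly as follows. This is only an illustration: the parameters `min_share` and `max_colors` are hypothetical, and expressing the minimum number of occurrences as a share of all pixels is one assumed way of normalizing for image size.

```python
from collections import Counter

def count_dominant_colors(pixels, min_share=0.01, max_colors=256):
    """Count the most frequent colors whose share of the pixels is at least min_share."""
    counts = Counter(pixels)
    total = len(pixels)
    # A share of the image, not a raw count, normalizes for image size.
    dominant = [color for color, n in counts.most_common(max_colors)
                if n / total >= min_share]
    return len(dominant)

# Toy "image" as a flat list of RGB tuples: two frequent colors and one rare one.
pixels = [(200, 180, 90)] * 60 + [(40, 40, 45)] * 39 + [(255, 0, 0)] * 1
n_dominant = count_dominant_colors(pixels, min_share=0.05)
```

With a 5% threshold the single red pixel is ignored, so only the two frequent colors are counted.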

Knowing the initial number of dominant colors, one can proceed to the generalization stage. The research used the SLIC superpixel detection method [24]. Such generalization results in so-called image mosaics, as shown in Fig. 2. It is worth noting that the human eye distinguishes only a few colors on the obtained mosaics, while the histograms indicate the existence of many close but separate clusters of colors with similar gray levels. Applying adaptive binarization at this point results in the extraction of many invalid conglomerates. This can be seen in the example of granite in Figs. 3 and 4.

The issue of excess objects caused by the closeness of colors is addressed in the next step of the proposed method (step 3). The goal of this step is to determine how to binarize the obtained mosaics properly, so that image fragments with visually similar but numerically different colors become one final cluster (conglomerate).

2.3 Estimation of the Number of Color Clusters of Detected Image Regions

It is suggested that, in order to binarize the mosaics, they should be clustered with respect to the number of major dominant colors. To determine this number automatically, it is proposed to use a method of estimating the number of clusters. The feature vector for this grouping consisted of the colors of the image after generalization. The Elbow method was used, which analyzes the percentage of stability (variance) within a given cluster. It requires a variance stabilization condition as input, which specifies the degree of variance of the parameter values at which the resulting number of clusters is considered optimal. By submitting the image color vector after generalization as a method parameter, one can observe a graph of variance stabilization (y axis) against the number of clusters (x axis), indicating the number of dominant colors in the mosaic (Fig. 5). In all cases, stabilization within clusters is achieved for values higher than 0.9, but the value is not constant for all rocks.
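
A curve of this kind can be sketched as follows. The sketch assumes that the stabilization value is the share of total variance explained by a k-means clustering of the colors; the text does not name the clustering algorithm or give the formula, so both are assumptions, as are the function names and the toy colors.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means with a deterministic start; returns the centers."""
    ordered = sorted(set(points))
    # Spread the initial centers evenly over the sorted distinct colors.
    centers = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers

def explained_variance(points, centers):
    """Share of the total squared deviation that the clustering accounts for."""
    mean = tuple(sum(axis) / len(points) for axis in zip(*points))
    total = sum(math.dist(p, mean) ** 2 for p in points)
    within = sum(min(math.dist(p, c) for c in centers) ** 2 for p in points)
    return 1.0 - within / total

# Toy mosaic colors: three visually distinct groups with slight numeric scatter.
colors = ([(10, 10, 10), (12, 11, 9), (11, 12, 10)] +
          [(120, 118, 119), (122, 121, 120)] +
          [(240, 239, 241), (238, 240, 242), (241, 241, 240)])
curve = [explained_variance(colors, kmeans(colors, k)) for k in range(1, 6)]
```

For these toy colors the curve rises steeply up to three clusters and then flattens, which is the behavior the Elbow chart is meant to expose.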

In order to determine the value of the optimal number of clusters, one can proceed in various ways, e.g.:

the stabilization value of interest can be assumed a priori to be at a constant level for all rocks, e.g. 0.95;
the last significant change in the differences between successive values of cluster stabilization can be detected. This comes down to detecting the so-called elbow (factor) on the Elbow method chart. Analyzing the values of subsequent differences, it can be seen that whenever the factor indicated large changes in an iteration, in the analyzed cases it took values lower than 0.99. Therefore, it can be assumed that values of this coefficient lower than 0.99 indicate insufficient stabilization of variance and the need to increase the number of iterations. Values equal to 0.99 or higher indicate stabilization and the stop condition. Thus, the maximum value and for the stabilization condition may indicate a given number of clusters.
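
Both variants can be sketched on a precomputed stabilization curve. The formula of the factor is not given in the text; the sketch below assumes it is the ratio of the stabilization values for k and k + 1 clusters, and the function names and curve values are hypothetical.

```python
def clusters_by_threshold(curve, threshold=0.95):
    """First k whose stabilization value reaches a fixed, a priori threshold."""
    for k, value in enumerate(curve, start=1):
        if value >= threshold:
            return k
    return len(curve)

def clusters_by_elbow(curve, stop_factor=0.99):
    """First k for which adding one more cluster barely changes the value.

    Assumption: the factor is value(k) / value(k + 1), where curve[k - 1]
    holds the stabilization value for k clusters.
    """
    for k in range(1, len(curve)):
        value_k, value_next = curve[k - 1], curve[k]
        if value_next > 0 and value_k / value_next >= stop_factor:
            return k
    return len(curve)

# Hypothetical stabilization values for k = 1..6 clusters.
curve = [0.00, 0.62, 0.91, 0.965, 0.970, 0.972]
k_fixed = clusters_by_threshold(curve, threshold=0.95)
k_elbow = clusters_by_elbow(curve, stop_factor=0.99)
```

On this curve both rules select four clusters, but they need not agree in general: the fixed threshold looks at the absolute level, the factor at the relative change.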

2.4 Input Image Segmentation

Having obtained the number of main dominant colors in the previous step, it is proposed to use it as the number of clusters for the image segmentation method. One can then specify the number of result groups that the image should be divided into, so that each of the pixels belonging to the image is combined into homogeneous clusters of gray levels (ultimately colors) - see Figs. 6 and 7. Both original images and mosaics can be clustered.
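
Once the cluster centers are known, the segmentation itself reduces to labeling each pixel with its nearest center. A minimal sketch, assuming Euclidean distance in RGB and hypothetical center values:

```python
import math

def segment_by_color(image, centers):
    """Label every pixel with the index of its nearest cluster center."""
    return [
        [min(range(len(centers)), key=lambda i: math.dist(pixel, centers[i]))
         for pixel in row]
        for row in image
    ]

# Hypothetical cluster centers, e.g. as returned by a k-means run.
centers = [(11.0, 11.0, 10.0), (240.0, 240.0, 241.0)]
image = [
    [(10, 12, 9), (13, 10, 11), (238, 241, 240)],
    [(12, 11, 10), (239, 239, 242), (241, 240, 240)],
]
labels = segment_by_color(image, centers)
```

Each label map value can then be turned into a binary mask for one conglomerate class.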

Figure 8 presents the detected objects (grain conglomerates) for sample images for each of the 8 rocks studied. Input images were clustered (so as not to lose information omitted by the generalization method). A morphological gradient by erosion was used to detect the conglomerate boundary.
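
The morphological gradient by erosion (the mask minus its erosion) can be sketched on a binary mask as follows. The 3x3 square structuring element and the treatment of pixels outside the image as background are assumptions; the text does not specify them.

```python
def erode(mask):
    """Binary erosion with a 3x3 square; pixels outside the image count as 0."""
    h, w = len(mask), len(mask[0])

    def all_set(y, x):
        return all(
            0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)
        )

    return [[1 if all_set(y, x) else 0 for x in range(w)] for y in range(h)]

def gradient_by_erosion(mask):
    """Boundary pixels: those in the mask that erosion removes."""
    eroded = erode(mask)
    return [[m - e for m, e in zip(mrow, erow)] for mrow, erow in zip(mask, eroded)]

mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
boundary = gradient_by_erosion(mask)
```

For the 3x3 block in the example only the center pixel survives erosion, so the boundary is the one-pixel ring around it.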

2.5 Final Algorithm of Conglomerates Detection

The methods described in this chapter lead to the segmentation of groups of visually similar grains, called conglomerates. Figure 9 is a graphical presentation of the proposed conglomerate segmentation algorithm. It should be emphasized that the literature describes segmentation algorithms only for certain types of rocks. These methods work very well for the rocks for which they were developed, but usually fail for others. The method proposed here was developed to properly segment grains or grain groups for most grained rocks. It does not always lead to fully correct segmentation of grains. However, in our opinion, it is sufficient for the correct classification of rocks based on similar structural features.

3 Results for Similarity Determination Stage

Binary images with detected conglomerates can become the basis for the creation of feature vectors. Each object detected in the image can be described by a set of parameters. In this research, the following object parameters were used: surface area, circularity coefficient, longest/shortest diameter, equivalent diameter, and Feret diameters. As a result, a feature vector in the defined feature space is obtained for each object. One can stop at this approach when the system aims to find all similar objects in the database.
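
Two of these parameters can be computed directly from an object's area and perimeter. The sketch below uses the common textbook definitions, which are assumed here because the text does not state the exact formulas used.

```python
import math

def circularity(area, perimeter):
    """4*pi*A / P^2: equals 1 for a perfect circle, smaller for other shapes."""
    return 4.0 * math.pi * area / perimeter ** 2

def equivalent_diameter(area):
    """Diameter of the circle that has the same area as the object."""
    return 2.0 * math.sqrt(area / math.pi)

# A circle of radius 5 and a 10 x 10 square, both in pixel units.
circle_area, circle_perimeter = math.pi * 25.0, 2.0 * math.pi * 5.0
square_area, square_perimeter = 100.0, 40.0

circle_c = circularity(circle_area, circle_perimeter)
square_c = circularity(square_area, square_perimeter)
circle_d = equivalent_diameter(circle_area)
```

The square has a circularity of pi/4, which illustrates how the coefficient separates rounded from angular objects.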

However, if it is necessary to find images of rocks similar in shape and mutual quantitative relationships, then statistical measures describing the features of the objects can be used. In this work, the following were analyzed:

skewness of the feature value for all objects within the image; it determines the asymmetry of the distribution of the analyzed variable;
coefficient of variation of the feature value for all objects within the image; it defines the degree of variation in the distribution of the feature;
spread of the feature value for all objects within the image; it determines the quantitative diversity between objects in the image;
arithmetic mean of the feature values for all objects within the image; it determines the average quantitative measure between objects in the image.
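
The four measures listed above can be sketched for a single feature as follows. The sketch assumes population (not sample) statistics and takes the spread to be the range, max minus min; the text does not state which definitions were used, and the area values are hypothetical.

```python
import statistics

def describe(values):
    """Mean, coefficient of variation, spread and skewness of one feature."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    # Population skewness: third central moment over the cubed standard deviation.
    third = sum((v - mean) ** 3 for v in values) / len(values)
    return {
        "mean": mean,
        "cv": std / mean if mean else 0.0,
        "spread": max(values) - min(values),
        "skewness": third / std ** 3 if std else 0.0,
    }

# Hypothetical surface areas (in pixels) of the objects detected in one image.
areas = [10.0, 12.0, 11.0, 13.0, 54.0]
stats = describe(areas)
```

A single large object among many small ones gives a strongly positive skewness, which is the kind of quantitative relationship between objects these measures are meant to capture.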

In addition, the mean, coefficient of variation, dispersion and skewness of the orientation of objects in the image were introduced. The number of objects and the ratio of the conglomerate surface area coverage to their number were also taken into account. In this way, a vector consisting of 42 features (X42) was defined. Additionally, the usefulness of different subspaces of the 42-dimensional space was assessed:

X2 - a descriptor consisting of two features (the number of objects in the image and the ratio of conglomerate surface area coverage to their number),
X10m - a descriptor based only on average values,
X10cv - a descriptor based only on coefficients of variation,
X10r - a descriptor based only on dispersion of values,
X10s - a descriptor based on the skewness of parameters values.

The average geological correctness of the results retrieved with the kNN method for the various variants is presented in Table 1. The results deteriorate as the number of analyzed neighbors k increases. The best results, averaged over all rock groups, were obtained for the X10m space. The lowest results were obtained for searches based only on the number of conglomerates and the ratio of the surface area of conglomerates to their number.

Table 1. Comparison of the average (for all groups of images) values of the geological fit for different numbers of the most similar images (kNN method) for different feature spaces [%].

Table 2 presents a summary of the average value of correct geological fit for individual groups of images. It can be noted that, as in the case of Table 1, the highest results are obtained for descriptors of average values of shape parameters and based on the entire unreduced X42 space.

Table 2. Comparison of the average (for all groups of images) values of the geological fit for the ten most similar images (kNN method) for different feature spaces [%].

Examples of matching results are shown in Figs. 10 and 11. Searching with a key in the form of an image of a rock with a coarse-grained or fine-grained structure returns the most similar images with such a structure. Thus, image search appears capable of determining the structural similarity of images of different rocks.
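
One plausible reading of the geological fit reported in Tables 1 and 2 is the share of the k most similar images that belong to the same rock as the query, averaged per rock group. The sketch below implements that reading under stated assumptions: Euclidean distance, each image used once as the query and excluded from its own results, and hypothetical descriptors and labels.

```python
import math
from collections import defaultdict

def geological_fit(features, labels, k):
    """Per rock group, the mean share [%] of the k nearest images with the same label."""
    shares = defaultdict(list)
    for i, query in enumerate(features):
        # Rank all other images by Euclidean distance to the query image.
        others = sorted(
            (j for j in range(len(features)) if j != i),
            key=lambda j: math.dist(query, features[j]),
        )
        hits = sum(labels[j] == labels[i] for j in others[:k])
        shares[labels[i]].append(100.0 * hits / k)
    return {rock: sum(v) / len(v) for rock, v in shares.items()}

# Hypothetical, already normalized two-feature descriptors for six images.
features = [(0.10, 0.90), (0.12, 0.88), (0.15, 0.85),
            (0.80, 0.20), (0.82, 0.22), (0.40, 0.60)]
labels = ["dolomite", "dolomite", "dolomite", "granite", "granite", "granite"]
fit = geological_fit(features, labels, k=2)
```

In this toy data the last granite image lies closer to the dolomite group, so its two nearest neighbors are both dolomite and the granite fit drops to about 67%.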

4 Recapitulation

The paper presents a method of detecting conglomerates in microscopic images of rocks that allows searching for images of rocks with similar structural features. These features are understood as the size and shape of the rock components and the relationships between them. The proposed detection methodology is unsupervised and adaptive. It analyzes the number of characteristic color clusters in the examined images. The method returns good results for rocks with clear clusters of colored rock-forming objects. It is not sensitive to individual small colored inclusions (e.g. minerals embedded in a binder such as clay) and treats them as noise, which it skips at the stage of estimating the number of the most characteristic clusters. The method does not appear suitable as a segmentation method for accurately describing the rock, because its results are not accurate enough to serve as a basis for quantitative analysis of the rock. However, it is a good starting point for automatic analysis and interpretation of the content of petrographic images. In the authors’ opinion, the method can be successfully used for rock image retrieval. It can also be used for sorting images by given geological features when the input data are unknown and not described, so that training the system and using supervised classification methods (e.g. for segmenting rock grains) is impossible.

References

1. Zhang, Y., Wang, G., Li, M., Han, S.: Automated classification analysis of geological structures based on images data and deep learning model. Appl. Sci. 8(12), 2493 (2018)
2. Aligholi, S., Khajavi, R., Razmara, M.: Automated mineral identification algorithm using optical properties of crystals. Comput. Geosci. 85, 175–183 (2015)
3. Shu, L., McIsaac, K., Osinski, G.R., Francis, R.: Unsupervised feature learning for autonomous rock image classification. Comput. Geosci. 106, 10–17 (2017)
4. Izadi, H., Sadri, J., Bayati, M.: An intelligent system for mineral identification in thin sections based on a cascade approach. Comput. Geosci. 99, 37–49 (2017)
5. Młynarczuk, M., Habrat, M., Skoczylas, N.: The application of the automatic search for visually similar geological layers in a borehole in introscopic camera recordings. Measurement 85, 142–151 (2016)
6. Espinoza-Molina, D., Datcu, M.: Earth-observation image retrieval based on content, semantics, and metadata. IEEE Trans. Geosci. Remote Sens. 51(11), 5145–5159 (2013)
7. Castelli, V., Bergman, L.D.: Image Databases: Search and Retrieval of Digital Imagery. Wiley, New York (2004)
8. Fergus, R., Fei-Fei, L., Perona, P., Zisserman, A.: Learning object categories from Google’s image search. In: Tenth IEEE International Conference on Computer Vision, ICCV 2005, vol. 2, pp. 1816–1823 (2005)
9. Ładniak, M., Młynarczuk, M.: Search of visually similar microscopic rock images. Comput. Geosci. 19(1), 127–136 (2014). https://doi.org/10.1007/s10596-014-9459-2
10. Liu, Y., Zhang, D., Lu, G., Ma, W.Y.: A survey of content-based image retrieval with high-level semantics. Pattern Recogn. 40(1), 262–282 (2007)
11. Najgebauer, P., et al.: Fast dictionary matching for content-based image retrieval. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L., Zurada, J. (eds.) ICAISC 2015. LNCS (LNAI), vol. 9119, pp. 747–756. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-19324-3_67
12. Habrat, M., Młynarczuk, M.: Evaluation of local matching methods in image analysis for mineral grain tracking in microscope images of rock sections. Minerals 8, 182 (2018)
13. Gaillard, M., Egyed-Zsigmond, E.: Large scale reverse image search. In: XXXVème Congrès INFORSID, p. 127 (2017)
14. Habrat, M., Młynarczuk, M.: Object retrieval in microscopic images of rocks using the query by sketch method. Appl. Sci. 10, 278 (2020)
15. Wang, X.J., Zhang, L., Li, X., Ma, W.Y.: Annotating images by mining image search results. IEEE Trans. Pattern Anal. Mach. Intell. 30(11), 1919–1932 (2008)
16. Sivic, J., Zisserman, A.: Video Google: a text retrieval approach to object matching in videos. In: Proceedings of the Ninth IEEE International Conference on Computer Vision (ICCV 2003), vol. 2, pp. 1470–1477 (2003)
17. Aigrain, P., Zhang, H., Petkovic, D.: Content-based representation and retrieval of visual media: a state-of-the-art review. In: Zhang, H., Aigrain, P., Petkovic, D. (eds.) Representation and Retrieval of Visual Media in Multimedia Systems, pp. 3–26. Springer, Boston (1996). https://doi.org/10.1007/978-0-585-34549-9_2
18. Chen, Y., Wang, J.Z., Krovetz, R.: An unsupervised learning approach to content-based image retrieval. In: Seventh International Symposium on Signal Processing and Its Applications, vol. 1, pp. 197–200. IEEE (2003)
19. Dy, J.G., Brodley, C.E.: Feature selection for unsupervised learning. J. Mach. Learn. Res. 5, 845–889 (2004)
20. Zakariya, S., Ali, R., Ahmad, N.: Combining visual features of an image at different precision value of unsupervised content based image retrieval. In: IEEE International Conference on Computational Intelligence and Computing Research, ICCIC 2010, pp. 1–4. IEEE (2010)
21. Minka, T.P., Picard, R.W.: Interactive learning with a “society of models”. Pattern Recogn. 30, 565–581 (1997)
22. Rui, Y., Huang, T.S., Ortega, M., Mehrotra, S.: Relevance feedback: a power tool for interactive content-based image retrieval. IEEE Trans. Circuits Syst. Video Technol. 8(5), 644–655 (1998)
23. Zhou, X.S., Huang, T.S.: Relevance feedback in image retrieval: a comprehensive review. Multimedia Syst. 8(6), 536–544 (2003)
24. Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., Susstrunk, S.: SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Trans. Pattern Anal. Mach. Intell. 34(11), 2274–2282 (2012)

Acknowledgements

This work was financed by the AGH University of Science and Technology, Faculty of Geology, Geophysics and Environmental Protection, as part of a statutory project.

Author information

Authors and Affiliations

AGH University of Science and Technology, Mickiewicza 30, 30-059, Kraków, Poland
Magdalena Habrat & Mariusz Młynarczuk

Corresponding author

Correspondence to Mariusz Młynarczuk.

Editor information

Editors and Affiliations

University of Amsterdam, Amsterdam, The Netherlands
Valeria V. Krzhizhanovskaya
University of Amsterdam, Amsterdam, The Netherlands
Gábor Závodszky
University of Amsterdam, Amsterdam, The Netherlands
Michael H. Lees
University of Tennessee, Knoxville, TN, USA
Jack J. Dongarra
University of Amsterdam, Amsterdam, The Netherlands
Peter M. A. Sloot
Intellegibilis, Setúbal, Portugal
Sérgio Brissos
Intellegibilis, Setúbal, Portugal
João Teixeira

About this paper

Cite this paper

Habrat, M., Młynarczuk, M. (2020). Granulation-Based Reverse Image Retrieval for Microscopic Rock Images. In: Krzhizhanovskaya, V.V., et al. Computational Science – ICCS 2020. ICCS 2020. Lecture Notes in Computer Science, vol 12139. Springer, Cham. https://doi.org/10.1007/978-3-030-50420-5_6

DOI: https://doi.org/10.1007/978-3-030-50420-5_6
Published: 15 June 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-50419-9
Online ISBN: 978-3-030-50420-5
eBook Packages: Computer Science, Computer Science (R0)
