Abstract
Despite their popularity, approaches based on salient point descriptors have yet to be proven effective for content-based image retrieval. In this paper, we show how the Windsurf library can be used to carry out a fair comparison among the existing approaches based on salient points, contrasting them in terms of both effectiveness and efficiency. Our extensive experimental evaluation, performed on four image benchmarks, shows that techniques based on salient point descriptors are no more effective than other existing techniques and are less amenable to indexing; thus, their efficiency remains questionable.
Notes
http://en.wikipedia.org/wiki/Content-based_image_retrieval, retrieved on 2017/10/11 12:52:11.
The Windsurf library is written in Java and is released under the “QPL” license, being freely available at URI http://www-db.disi.unibo.it/Windsurf/ for education and research purposes only.
We thank Tim Flach (http://timflach.com) for the original image.
We also experimented with selecting the points with largest size, but results were significantly worse than the described alternative.
Performing experiments on a low-end machine was possible thanks to the very limited memory requirements of the Windsurf framework. For example, for solving a k-NN query, the M-tree only requires a single tree node to be in main memory, while the largest data structure is the priority queue containing ids of nodes yet to be accessed, whose maximum size is M integers, with M denoting the number of tree nodes.
In Appendix B, we show how it is possible to derive the distance distribution of a Hausdorff metric given the distribution of its ground distance \(\delta\).
References
Alahi, A., Ortiz, R., Vandergheynst, P.: FREAK: fast retina keypoint, pp. 510–517. CVPR ’12, Providence (2012)
Amato, G., Falchi, F.: kNN based image classification relying on local feature similarity, pp. 101–108. SISAP ’10, Istanbul (2010)
Amores, J.: Multiple instance classification: review, taxonomy and comparative study. Artif. Intell. 201, 81–105 (2013)
Ardizzoni, S., Bartolini, I., Patella, M.: Windsurf: region-based image retrieval using wavelets, pp. 167–173. IWOSS 1999, Florence (1999)
Bartolini, I., Ciaccia, P.: Imagination: exploiting link analysis for accurate image annotation, pp. 32–44. AMR 2007, Paris (2007)
Bartolini, I., Ciaccia, P., Patella, M.: Adaptively browsing image databases with PIBE. Multimed. Tools Appl. 31(3), 269–286 (2006)
Bartolini, I., Ciaccia, P., Patella, M.: Query processing issues in region-based image databases. Knowl. Inform. Syst. 25(2), 389–420 (2010)
Bartolini, I., Patella, M., Stromei, G.: The Windsurf library for the efficient retrieval of multimedia hierarchical data, pp. 139–148. SIGMAP 2011, Seville (2011)
Bay, H., Tuytelaars, T., Van Gool, L.: SURF: speeded up robust features, pp. 404–417. ECCV 2006, Graz (2006)
Beecks, C., Uysal, M.S., Seidl, T.: A comparative study of similarity measures for content-based multimedia retrieval, pp. 1552–1557. ICME 2010, Suntec (2010)
Bekele, D., Teutsch, M., Schuchert, T.: Evaluation of binary keypoint descriptors, pp. 3652–3656. ICIP 2013, Melbourne (2013)
Beyer, K., Goldstein, J., Ramakrishnan, R., Shaft, U.: When is “nearest neighbor” meaningful?, pp. 217–235. ICDT 1999, Jerusalem (1999)
Börzsönyi, S., Kossmann, D., Stocker, K.: The Skyline operator, pp. 421–430. ICDE 2001, Heidelberg (2001)
Chávez, E., Navarro, G., Baeza-Yates, R.A., Marroquín, J.L.: Searching in metric spaces. ACM Comput. Surv. 33(3), 273–321 (2001)
Ciaccia, P., Patella, M., Zezula, P.: M-tree: an efficient access method for similarity search in metric spaces, pp. 426–435. VLDB 1997, Athens (1997)
Ciaccia, P., Patella, M., Zezula, P.: A cost model for similarity queries in metric spaces, pp. 59–68. PODS 1998, Seattle (1998)
Datta, R., Joshi, D., Li, J., Wang, J.Z.: Image retrieval: ideas, influences, and trends of the new age. ACM Comput. Surv. 40(2), 5 (2008)
Deselaers, T., Keysers, D., Ney, H.: Classification error rate for quantitative evaluation of content-based image retrieval systems, pp. 505–508. ICPR 2004, Cambridge (2004)
Deselaers, T., Keysers, D., Ney, H.: Features for image retrieval: an experimental comparison. Inform. Retr. 11(2), 77–107 (2008)
Ghosh, N., Rimoldi, O.E., Beanlands, R.S., Camici, P.G.: Assessment of myocardial ischaemia and viability: role of positron emission tomography. Eur. Heart J. 31(24), 2984–2995 (2010)
Haralick, R., Shapiro, L.: Computer and robot vision II. Addison-Wesley, Reading (1993)
Hayashida, M., Koyano, H., Akutsu, T.: Measuring the similarity of protein structures using image local feature descriptors SIFT and SURF, pp. 164–168. ISB ’14, Qingdao (2014)
Heim, A.M.: Identification and bridging of semantic gaps in the context of multi-domain engineering, pp. 57–58. fPET-2010, Golden (2010)
Ilyas, I.F., Beskales, G., Soliman, M.A.: A survey of top-k query processing techniques in relational database systems. ACM Comput. Surv. 40(4), 11 (2008)
Kailath, T.: The divergence and Bhattacharyya distance measures in signal selection. IEEE Trans. Commun. Technol. 15(1), 52–60 (1967)
Kalantidis, Y., Tolias, G., Avrithis, Y., Phinikettos, M., Spyrou, E., Mylonas, P., Kollias, S.: VIRaL: visual image retrieval and localization. Multimed. Tools Appl. 51(2), 555–592 (2011)
Kim, M.U., Yoon, K.: Performance evaluation of large-scale object recognition system using bag of visual words model. Multimed. Tools Appl. 74(7), 2499–2517 (2015)
Leutenegger, S., Chli, M., Siegwart, R.Y.: BRISK: binary robust invariant scalable keypoints, pp. 2548–2555. ICCV ’11, Washington (2011)
Lew, M.S., Sebe, N., Djeraba, C., Jain, R.: Content-based multimedia information retrieval: state of the art and challenges. ACM Trans. Multimed. Comput. Commun. Appl. 2(1), 1–19 (2006)
Ling, H., Okada, K.: An efficient earth mover’s distance algorithm for robust histogram comparison. IEEE Trans. Pattern Anal. Mach. Intell. 29(5), 840–853 (2007)
Lee, Y.-H., Kim, Y.: Efficient image retrieval using advanced SURF and DCD on mobile platform. Multimed. Tools Appl. 74(7), 2289–2299 (2015)
Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004)
Müller, H., Müller, W., Squire, D.McG., Marchand-Maillet, S., Pun, T.: Performance evaluation in content-based image retrieval: overview and proposals. Pattern Recognit. Lett. 22(5), 593–601 (2001)
Narasimhalu, A.D., Kankanhalli, M.S., Wu, J.: Benchmarking multimedia databases. Multimed. Tools Appl. 4(3), 333–356 (1997)
Penatti, O.A., Valle, E., Torres, R.D.S.: Comparative study of global color and texture descriptors for web image retrieval. J. Vis. Commun. Image Represent. 23(2), 359–380 (2012)
Rahmani, R., Goldman, S.A., Zhang, H., Cholleti, S.R., Fritts, J.E.: Localized content-based image retrieval. IEEE Trans. Pattern Anal. Mach. Intell. 30(11), 1902–1912 (2008)
Rublee, E., Rabaud, V., Konolige, K., Bradski, G.: ORB: an efficient alternative to SIFT or SURF, pp. 2564–2571. ICCV ’11, Washington (2011)
Rubner, Y., Tomasi, C.: Perceptual metrics for image database navigation. Kluwer, Boston (2000)
Smeulders, A.W., Worring, M., Santini, S., Gupta, A., Jain, R.: Content-based image retrieval at the end of the early years. IEEE Trans. Pattern Anal. Mach. Intell. 22(12), 1349–1380 (2000)
Thomee, B., Shamma, D., Friedland, G., Elizalde, B., Ni, K., Poland, D., Borth, D., Li, L.-J.: YFCC100M: the new data in multimedia research. Commun. ACM 59(2), 64–73 (2016)
Zezula, P., Amato, G., Dohnal, V., Batko, M.: Similarity search—the metric space approach. Springer, Berlin (2006)
Zhou, W., Li, H., Lu, Y., Wang, M., Tian, Q.: Visual word expansion and BSIFT verification for large-scale image search. Multimed. Syst. 21(3), 245–254 (2015)
Acknowledgements
The authors thank Emanuele Dall’Ospedale (for providing the code for salient point management and experiments on Windsurf, SIFT, and SURF), Pietro Pascarella (for the experiments on ORB, BRISK, and FREAK), and Paolo Ciaccia (for the discussions about the model presented in Sect. 3.2).
Additional information
Communicated by T. Plagemann.
Appendices
Appendix A: The Windsurf system
This appendix provides a brief description of Windsurf; additional details can be found in [4]. Windsurf fragments each image (document) into a set of regions (elements) whose pixels are similar in color and texture. To this end, each image is analyzed using (the third level of the LL sub-band of) a 2-D Haar Discrete Wavelet Transform (DWT) in the HSV color space. The K-means clustering algorithm is then used to group wavelet coefficients, which are compared using their Mahalanobis distance. To discover the “right” number of regions for each image, the K-means algorithm is iterated with \(K \in [2,10]\), and the “optimal” K value is chosen as the one that minimizes a validity function. Each of the clusters so obtained corresponds to an image region \(R_i\), which is represented, for each of the 4 sub-bands of the third-level DWT, by the centroid and the covariance matrix of the wavelet coefficients in the corresponding cluster: for band \(B\), these are denoted, respectively, as \(\mu _{R_i}^B\) and \(\mathbf {C}_{R_i}^{B}\). Intuitively, the centroid summarizes color information, while the covariance matrix captures the texture of pixels in the cluster.
To compare two regions (elements) \(R_i\) and \(R_j\), the element distance function \(\delta\) takes into account differences between wavelet descriptors over the four frequency sub-bands:
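A form consistent with the closing remark of this appendix, which describes \(\delta\) as the square root of a weighted sum of squared per-band distances, is the following. The weight symbol \(\gamma _B\) and the band set \(\mathcal {B}\) are notation assumed here for illustration, not taken from the original.

\[
\delta \left( R_i,R_j \right) = \sqrt{\sum _{B \in \mathcal {B}} \gamma _B \, \delta _{B}\left( R_i,R_j \right) ^2} \qquad (6)
\]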
The distance \(\delta _{B} \left( R_i,R_j \right)\) on each band \(B\) is computed as the Bhattacharyya metric coefficient [25]:
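One standard way to obtain a metric from the Bhattacharyya distance is sketched below. This specific form is an assumption made here and may differ from the original display: \(e^{-\rho _B}\) is the Bhattacharyya coefficient, and \(\sqrt{1 - e^{-\rho _B}}\) is the Hellinger distance between the two Gaussian descriptors, which satisfies the triangle inequality.

\[
\delta _{B}\left( R_i,R_j \right) = \sqrt{1 - e^{-\rho _{B}\left( R_i,R_j \right) }} \qquad (7)
\]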
where \(\rho _{B}\left( R_i,R_j \right)\) (the Bhattacharyya distance) is expressed as follows:
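For descriptors made of a centroid and a covariance matrix, the standard Gaussian form of the Bhattacharyya distance matches the two-term description that follows. It is written here in the notation of this appendix as a reconstruction, with the covariance (log-determinant) term first and the centroid term second:

\[
\rho _{B}\left( R_i,R_j \right) = \frac{1}{2} \ln \frac{\left| \frac{\mathbf {C}_{R_i}^{B} + \mathbf {C}_{R_j}^{B}}{2} \right| }{\sqrt{\left| \mathbf {C}_{R_i}^{B} \right| \left| \mathbf {C}_{R_j}^{B} \right| }} + \frac{1}{8} \left( \mu _{R_i}^B - \mu _{R_j}^B \right) ^T \left( \frac{\mathbf {C}_{R_i}^{B} + \mathbf {C}_{R_j}^{B}}{2} \right) ^{-1} \left( \mu _{R_i}^B - \mu _{R_j}^B \right) \qquad (8)
\]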
in which \(|\mathbf {A}|\) denotes the determinant of matrix \(\mathbf {A}\). The definition of \(\rho _{B}\left( R_i,R_j \right)\) consists of two terms: the first one compares the covariance matrices of the two regions, while the second one is the Mahalanobis distance between the region centroids, computed with the average of the two covariance matrices. Note that if the regions have the same centroid, the second term vanishes, and thus, the first term measures how similar the two 3-D ellipsoids are (this is the case of regions with similar colors but different texture).
We finally note that the element distance \(\delta\) defined in Eq. 6 is a metric, since the Bhattacharyya metric coefficient (Eq. 7) is a metric [25] and the square root of a (weighted) sum of squared metrics is also a metric.
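The computation described in this appendix can be illustrated with a minimal Python sketch. It is not the Windsurf implementation (the library is written in Java): the function names, the dictionary layout of a region, and the per-band weights are hypothetical. The per-band distance is assumed to be \(\sqrt{1 - e^{-\rho _B}}\), and \(\rho _B\) is the standard Gaussian Bhattacharyya distance (log-determinant term plus one eighth of the Mahalanobis term).

```python
import math


def _det_and_solve(a, b):
    """Return (det(a), x) with a @ x = b, using Gaussian elimination."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    det = 1.0
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular covariance matrix")
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return det, x


def bhattacharyya_distance(mu_i, cov_i, mu_j, cov_j):
    """Standard Bhattacharyya distance between two Gaussian descriptors."""
    n = len(mu_i)
    avg = [[(cov_i[r][c] + cov_j[r][c]) / 2.0 for c in range(n)]
           for r in range(n)]
    diff = [a - b for a, b in zip(mu_i, mu_j)]
    det_avg, sol = _det_and_solve(avg, diff)  # sol = avg^{-1} diff
    det_i, _ = _det_and_solve(cov_i, [0.0] * n)
    det_j, _ = _det_and_solve(cov_j, [0.0] * n)
    cov_term = 0.5 * math.log(det_avg / math.sqrt(det_i * det_j))
    mean_term = 0.125 * sum(d * s for d, s in zip(diff, sol))
    return cov_term + mean_term


def band_distance(rho):
    """Assumed metric form: sqrt(1 - exp(-rho)), clamped against rounding."""
    return math.sqrt(max(0.0, 1.0 - math.exp(-rho)))


def region_distance(region_i, region_j, weights):
    """Square root of the weighted sum of squared per-band distances.

    Each region maps a band name to a (centroid, covariance) pair;
    `weights` maps band names to hypothetical non-negative weights.
    """
    total = 0.0
    for band, w in weights.items():
        mu_i, cov_i = region_i[band]
        mu_j, cov_j = region_j[band]
        rho = bhattacharyya_distance(mu_i, cov_i, mu_j, cov_j)
        total += w * band_distance(rho) ** 2
    return math.sqrt(total)
```

With equal centroids, only the covariance term contributes to `bhattacharyya_distance`; with equal covariance matrices, only the centroid term does, which mirrors the color versus texture reading given above.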
Appendix B: Distance distribution of the Hausdorff metric
This appendix shows how to compute the distance distribution of the Hausdorff metric \({d}_H\), given the distance distribution of the ground distance \(\delta\) (see Eq. 1).
Formally, given a distance function \(\delta\) over a domain \(\Omega\), its distance distribution F(x) equals the probability that the distance between two random points \(\mathbf {p}\) and \(\mathbf {q}\) drawn from \(\Omega\) is not higher than x, \(F(x) = \Pr \{ \delta (\mathbf {p}, \mathbf {q}) \le x \}\) [16].
Given the ground distance \(\delta\) with its distance distribution F(x), we now want to compute the distance distribution G(x) of \({d}_H\), supposing that the two sets, \(\mathbf {S}_1\) and \(\mathbf {S}_2\), contain random points drawn from \(\Omega\) with the same distribution (so that the distance between any point of \(\mathbf {S}_1\) and any point of \(\mathbf {S}_2\) is distributed according to F(x)).
We first consider the general case in which the two sets have cardinalities \(N_1\) and \(N_2\), respectively. Given any point \(\mathbf {p}_i\) in \(\mathbf {S}_1\), the distribution of the distance to its nearest neighbor (NN) in \(\mathbf {S}_2\) can be computed as follows [16]:
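Assuming that the \(N_2\) distances from \(\mathbf {p}_i\) to the points of \(\mathbf {S}_2\) are independent and each distributed according to F(x), the NN distance exceeds x only if all \(N_2\) distances do. The symbol \(F_{NN}\) is notation introduced here for illustration:

\[
F_{NN}(x) = \Pr \left\{ \min _{\mathbf {q} \in \mathbf {S}_2} \delta (\mathbf {p}_i, \mathbf {q}) \le x \right\} = 1 - \left( 1 - F(x) \right) ^{N_2}
\]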
The (asymmetric) distance from \(\mathbf {S}_1\) to \(\mathbf {S}_2\) is computed as the maximum of NN distances for all points in \(\mathbf {S}_1\), whose distribution is:
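If the NN distances of the \(N_1\) points of \(\mathbf {S}_1\) are also treated as independent (an assumption), their maximum is at most x exactly when each of them is, and each NN distance is at most x with probability \(1 - (1 - F(x))^{N_2}\). The symbol \(G_{1 \rightarrow 2}\) is notation introduced here for illustration:

\[
G_{1 \rightarrow 2}(x) = \left( 1 - \left( 1 - F(x) \right) ^{N_2} \right) ^{N_1}
\]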
The same can be computed for the opposite distance obtaining:
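Swapping the roles of the two sets gives the mirrored expression. Taking \({d}_H\) as the larger of the two directed distances, as in the usual definition of the Hausdorff metric, and further assuming that the two directed distances are independent (a simplification, since both are computed from the same pairwise distances), G(x) is the product of the two directed distributions. The symbol \(G_{2 \rightarrow 1}\) is notation introduced here for illustration:

\[
G_{2 \rightarrow 1}(x) = \left( 1 - \left( 1 - F(x) \right) ^{N_1} \right) ^{N_2}, \qquad G(x) = \left( 1 - \left( 1 - F(x) \right) ^{N_2} \right) ^{N_1} \left( 1 - \left( 1 - F(x) \right) ^{N_1} \right) ^{N_2}
\]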
which, for the case \(N_1=N_2\), becomes:
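Treating the two directed distances as independent (an assumption) and taking \({d}_H\) as the larger of the two, G(x) is the product of the two directed distributions; writing N for the common cardinality, this gives

\[
G(x) = \left( 1 - \left( 1 - F(x) \right) ^{N} \right) ^{2N}
\]

The derivation can be checked numerically with a minimal Python sketch. The function names are hypothetical, and the simulation draws every pairwise distance independently and uniformly on [0, 1], so that F(x) = x; under that idealized model the directed distribution is exact, while the product form for G(x) remains an approximation.

```python
import random


def nn_cdf(f_x, n2):
    """Pr{NN distance <= x} for one point, given F(x) and |S2| = n2."""
    return 1.0 - (1.0 - f_x) ** n2


def directed_cdf(f_x, n1, n2):
    """Pr{directed distance from S1 to S2 <= x}: all n1 NN distances <= x."""
    return nn_cdf(f_x, n2) ** n1


def hausdorff_cdf(f_x, n1, n2):
    """Product of both directed distributions (independence assumed)."""
    return directed_cdf(f_x, n1, n2) * directed_cdf(f_x, n2, n1)


def simulate_directed(x, n1, n2, trials, seed=0):
    """Monte Carlo estimate of Pr{directed distance <= x}.

    Pairwise distances are i.i.d. uniform on [0, 1], hence F(x) = x.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Max over the points of S1 of the distance to their NN in S2.
        directed = max(
            min(rng.random() for _ in range(n2)) for _ in range(n1)
        )
        hits += directed <= x
    return hits / trials


if __name__ == "__main__":
    x, n1, n2 = 0.3, 3, 4
    print("analytic :", directed_cdf(x, n1, n2))
    print("simulated:", simulate_directed(x, n1, n2, trials=20000))
    print("G(x)     :", hausdorff_cdf(x, n1, n2))
```

In real data the pairwise distances are constrained by the triangle inequality and are therefore not independent, so the closed form should be read as a model-based estimate.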
Cite this article
Bartolini, I., Patella, M. Windsurf: the best way to SURF. Multimedia Systems 24, 459–476 (2018). https://doi.org/10.1007/s00530-017-0567-4
DOI: https://doi.org/10.1007/s00530-017-0567-4