Windsurf: the best way to SURF

(and SIFT/BRISK/ORB/FREAK, too)

  • Regular Paper
  • Multimedia Systems

Abstract

Despite their popularity, approaches based on salient point descriptors have yet to be proven effective for content-based image retrieval. In this paper, we show how the Windsurf library can be exploited to carry out a fair comparison among the existing alternative approaches based on salient points, contrasting them in terms of both effectiveness and efficiency. Our extensive experimental evaluation, performed on four different image benchmarks, shows that techniques based on salient point descriptors are no more effective than other existing techniques and are less amenable to indexing; thus, their efficiency remains questionable.


Notes

  1. http://en.wikipedia.org/wiki/Content-based_image_retrieval, retrieved on 2017/10/11 12:52:11.

  2. http://images.google.com.

  3. http://www.imageclef.org.

  4. The Windsurf library is written in Java and is released under the “QPL” license, being freely available at URI http://www-db.disi.unibo.it/Windsurf/ for education and research purposes only.

  5. http://www.mysql.com.

  6. We thank Tim Flach (http://timflach.com) for the original image.

  7. http://opencv.org.

  8. We also experimented with selecting the points with largest size, but results were significantly worse than the described alternative.

  9. Performing experiments on a low-end machine was possible thanks to the very limited memory requirements of the Windsurf framework. For example, for solving a k-NN query, the M-tree only requires a single tree node to be in main memory, while the largest data structure is the priority queue containing ids of nodes yet to be accessed, whose maximum size is M integers, with M denoting the number of tree nodes.

  10. In Appendix B, we show how it is possible to derive the distance distribution of a Hausdorff metric given the distribution of its ground distance \(\delta\).

  11. http://www.yfcc100m.org.

References

  1. Alahi, A., Ortiz, R., Vandergheynst, P.: FREAK: fast retina keypoint, pp. 510–517. CVPR ’12, Providence (2012)

  2. Amato, G., Falchi, F.: kNN based image classification relying on local feature similarity, pp. 101–108. SISAP ’10, Istanbul (2010)

  3. Amores, J.: Multiple instance classification: review, taxonomy and comparative study. Artif. Intell. 201, 81–105 (2013)

  4. Ardizzoni, S., Bartolini, I., Patella, M.: Windsurf: region-based image retrieval using wavelets, pp. 167–173. IWOSS 1999, Florence (1999)

  5. Bartolini, I., Ciaccia, P.: Imagination: exploiting link analysis for accurate image annotation, pp. 32–44. AMR 2007, Paris (2007)

  6. Bartolini, I., Ciaccia, P., Patella, M.: Adaptively browsing image databases with PIBE. Multimed. Tools Appl. 31(3), 269–286 (2006)

  7. Bartolini, I., Ciaccia, P., Patella, M.: Query processing issues in region-based image databases. Knowl. Inform. Syst. 25(2), 389–420 (2010)

  8. Bartolini, I., Patella, M., Stromei, G.: The Windsurf library for the efficient retrieval of multimedia hierarchical data, pp. 139–148. SIGMAP 2011, Seville (2011)

  9. Bay, H., Tuytelaars, T., Van Gool, L.: SURF: speeded up robust features, pp. 404–417. ECCV 2006, Graz (2006)

  10. Beecks, C., Uysal, M.S., Seidl, T.: A comparative study of similarity measures for content-based multimedia retrieval, pp. 1552–1557. ICME 2010, Suntec (2010)

  11. Bekele, D., Teutsch, M., Schuchert, T.: Evaluation of binary keypoint descriptors, pp. 3652–3656. ICIP 2013, Melbourne (2013)

  12. Beyer, K., Goldstein, J., Ramakrishnan, R., Shaft, U.: When is “nearest neighbor” meaningful?, pp. 217–235. ICDT 1999, Jerusalem (1999)

  13. Börzsonyi, S., Kossmann, D., Stocker, K.: The Skyline operator, pp. 421–430. ICDE 2001, Heidelberg (2001)

  14. Chávez, E., Navarro, G., Baeza-Yates, R.A., Marroquín, J.L.: Searching in metric spaces. ACM Comput. Surv. 33(3), 273–321 (2001)

  15. Ciaccia, P., Patella, M., Zezula, P.: M-tree: an efficient access method for similarity search in metric spaces, pp. 426–435. VLDB 1997, Athens (1997)

  16. Ciaccia, P., Patella, M., Zezula, P.: A cost model for similarity queries in metric spaces, pp. 59–68. PODS 1998, Seattle (1998)

  17. Datta, R., Joshi, D., Li, J., Wang, J.Z.: Image retrieval: ideas, influences, and trends of the new age. ACM Comput. Surv. 40(2), 5 (2008)

  18. Deselaers, T., Keysers, D., Ney, H.: Classification error rate for quantitative evaluation of content-based image retrieval systems, pp. 505–508. ICPR 2004, Cambridge (2004)

  19. Deselaers, T., Keysers, D., Ney, H.: Features for image retrieval: an experimental comparison. Inform. Retr. 11(2), 77–107 (2008)

  20. Ghosh, N., Rimoldi, O.E., Beanlands, R.S., Camici, P.G.: Assessment of myocardial ischaemia and viability: role of positron emission tomography. Eur. Heart J. 31(24), 2984–2995 (2010)

  21. Haralick, R., Shapiro, L.: Computer and robot vision II. Addison-Wesley, Reading (1993)

  22. Hayashida, M., Koyano, H., Akutsu, T.: Measuring the similarity of protein structures using image local feature descriptors SIFT and SURF, pp. 164–168. ISB ’14, Qingdao (2014)

  23. Heim, A.M.: Identification and bridging of semantic gaps in the context of multi-domain engineering, pp. 57–58. fPET-2010, Golden (2010)

  24. Ilyas, I.F., Beskales, G., Soliman, M.A.: A survey of top-k query processing techniques in relational database systems. ACM Comput. Surv. 40(4), 11 (2008)

  25. Kailath, T.: The divergence and Bhattacharyya distance measures in signal selection. IEEE Trans. Commun. Technol. 15(1), 52–60 (1967)

  26. Kalantidis, Y., Tolias, G., Avrithis, Y., Phinikettos, M., Spyrou, E., Mylonas, P., Kollias, S.: VIRaL: visual image retrieval and localization. Multimed. Tools Appl. 51(2), 555–592 (2011)

  27. Kim, M.U., Yoon, K.: Performance evaluation of large-scale object recognition system using bag of visual words model. Multimed. Tools Appl. 74(7), 2499–2517 (2015)

  28. Leutenegger, S., Chli, M., Siegwart, R.Y.: BRISK: binary robust invariant scalable keypoints, pp. 2548–2555. ICCV ’11, Washington (2011)

  29. Lew, M.S., Sebe, N., Djeraba, C., Jain, R.: Content-based multimedia information retrieval: state of the art and challenges. ACM Trans. Multimed. Comput. Commun. Appl. 2(1), 1–19 (2006)

  30. Ling, H., Okada, K.: An efficient earth mover’s distance algorithm for robust histogram comparison. IEEE Trans. Pattern Anal. Mach. Intell. 29(5), 840–853 (2007)

  31. Lee, Y.-H., Kim, Y.: Efficient image retrieval using advanced SURF and DCD on mobile platform. Multimed. Tools Appl. 74(7), 2289–2299 (2015)

  32. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004)

  33. Müller, H., Müller, W., Squire, D.McG., Marchand-Maillet, S., Pun, T.: Performance evaluation in content-based image retrieval: overview and proposals. Pattern Recognit. Lett. 22(5), 593–601 (2001)

  34. Narasimhalu, A.D., Kankanhalli, M.S., Wu, J.: Benchmarking multimedia databases. Multimed. Tools Appl. 4(3), 333–356 (1997)

  35. Penatti, O.A., Valle, E., Torres, R.D.S.: Comparative study of global color and texture descriptors for web image retrieval. J. Vis. Commun. Image Represent. 23(2), 359–380 (2012)

  36. Rahmani, R., Goldman, S.A., Zhang, H., Cholleti, S.R., Fritts, J.E.: Localized content-based image retrieval. IEEE Trans. Pattern Anal. Mach. Intell. 30(11), 1902–1912 (2008)

  37. Rublee, E., Rabaud, V., Konolige, K., Bradski, G.: ORB: an efficient alternative to SIFT or SURF, pp. 2564–2571. ICCV ’11, Washington (2011)

  38. Rubner, Y., Tomasi, C.: Perceptual metrics for image database navigation. Kluwer, Boston (2000)

  39. Smeulders, A.W., Worring, M., Santini, S., Gupta, A., Jain, R.: Content-based image retrieval at the end of the early years. IEEE Trans. Pattern Anal. Mach. Intell. 22(12), 1349–1380 (2000)

  40. Thomee, B., Shamma, D., Friedland, G., Elizalde, B., Ni, K., Poland, D., Borth, D., Li, L.-J.: YFCC100M: the new data in multimedia research. Commun. ACM 59(2), 64–73 (2016)

  41. Zezula, P., Amato, G., Dohnal, V., Batko, M.: Similarity search—the metric space approach. Springer, Berlin (2006)

  42. Zhou, W., Li, H., Lu, Y., Wang, M., Tian, Q.: Visual word expansion and BSIFT verification for large-scale image search. Multimed. Syst. 21(3), 245–254 (2015)


Acknowledgements

The authors thank Emanuele Dall’Ospedale (for providing the code for salient point management and experiments on Windsurf, SIFT, and SURF), Pietro Pascarella (for the experiments on ORB, BRISK, and FREAK), and Paolo Ciaccia (for the discussions about the model presented in Sect. 3.2).

Author information

Corresponding author

Correspondence to Marco Patella.

Additional information

Communicated by T. Plagemann.

Appendices

Appendix A: The Windsurf system

This appendix provides a brief description of Windsurf; additional details can be found in [4]. Windsurf fragments each image (document) into a set of regions (elements) whose pixels are similar in color and texture. To this end, each image is analyzed using (the third level of the LL sub-band of) a 2-D Haar Discrete Wavelet Transform (DWT) in the HSV color space. The K-means clustering algorithm is then used to group wavelet coefficients (these are compared using their Mahalanobis distance). To discover the “right” number of regions for each image, the K-means algorithm is iterated with \(K \in [2,10]\), and the “optimal” K value is chosen as the one that minimizes a validity function. Each resulting cluster corresponds to an image region \(R_i\), which is represented, for each of the 4 sub-bands of the third level DWT, by the centroid and the covariance matrix of the wavelet coefficients in the corresponding cluster: for band \(B\), these are denoted, respectively, as \(\mu _{R_i}^B\) and \(\mathbf {C}_{R_i}^{B}\). Intuitively, the centroid summarizes color information, while the covariance matrix captures the texture of pixels in the cluster.
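As a rough illustration of the wavelet step described above, the following sketch applies a 2-D Haar DWT to a single image channel and recurses three times on the LL sub-band. It is not the Windsurf implementation: the function names, the orthonormal scaling factor, and the assumption that height and width are divisible by 8 are all choices made here for illustration only.

```python
def haar_level(channel):
    """One level of a 2-D Haar DWT on a 2-D list with even height and width.

    Returns four sub-bands, each half the input size: the low-pass band (LL)
    and three detail bands. The 1/2 scaling per 2x2 block (orthonormal Haar)
    is an assumption; other normalizations are equally common.
    """
    rows, cols = len(channel), len(channel[0])
    ll, d_col, d_row, d_diag = [], [], [], []
    for r in range(0, rows, 2):
        ll_row, col_row, row_row, diag_row = [], [], [], []
        for c in range(0, cols, 2):
            a, b = channel[r][c], channel[r][c + 1]
            d, e = channel[r + 1][c], channel[r + 1][c + 1]
            ll_row.append((a + b + d + e) / 2)    # average of the 2x2 block
            col_row.append((a - b + d - e) / 2)   # difference across columns
            row_row.append((a + b - d - e) / 2)   # difference across rows
            diag_row.append((a - b - d + e) / 2)  # diagonal difference
        ll.append(ll_row)
        d_col.append(col_row)
        d_row.append(row_row)
        d_diag.append(diag_row)
    return ll, d_col, d_row, d_diag


def haar_third_level(channel):
    """Apply haar_level three times, each time recursing on the LL band.

    Returns the four sub-bands of the third decomposition level.
    """
    bands = None
    current = channel
    for _ in range(3):
        bands = haar_level(current)
        current = bands[0]
    return bands
```

For an HSV image, such a function would be applied to each of the three channels separately, so that every position in a third-level sub-band carries a 3-D coefficient vector to be clustered.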

To compare two regions (elements) \(R_i\) and \(R_j\), the element distance function \(\delta\) takes into account differences between wavelet descriptors over the four frequency sub-bands:

$$\begin{aligned} \delta \left( R_i,R_j \right) = \sqrt{\frac{1}{4} \sum _{B=1}^4 \delta _{B} \left( R_i,R_j \right) ^2}. \end{aligned}$$
(6)

The distance \(\delta _{B} \left( R_i,R_j \right)\) on each band \(B\) is computed as the Bhattacharyya metric coefficient [25]:

$$\begin{aligned} \delta _{B} \left( R_i,R_j \right) ^2 = 1 - e^{-\rho _{B}\left( R_i,R_j \right) }, \end{aligned}$$
(7)

where \(\rho _{B}\left( R_i,R_j \right)\) (the Bhattacharyya distance) is expressed as follows:

$$\rho _{B}\left( R_i,R_j \right) = \frac{1}{2} \ln \left( \frac{ \left| \frac{\mathbf {C}_{R_i}^{B} + \mathbf {C}_{R_j}^{B}}{2} \right| }{\left| \mathbf {C}_{R_i}^{B} \right| ^{\frac{1}{2}}\cdot \left| \mathbf {C}_{R_j}^{B} \right| ^{\frac{1}{2}}}\right)+\frac{1}{8} \left[ \left( \mu _{R_i}^B- \mu _{R_j}^B\right) ^{T} \left( \frac{\mathbf {C}_{R_i}^{B} + \mathbf {C}_{R_j}^{B}}{2} \right) ^{-1} \left( \mu _{R_i}^B- \mu _{R_j}^B\right) \right]$$
(8)

in which \(|\mathbf {A}|\) denotes the determinant of matrix \(\mathbf {A}\). The definition of \(\rho _{B}\left( R_i,R_j \right)\) consists of two terms: the first compares the covariance matrices of the two regions, while the second is (one-eighth of) the squared Mahalanobis distance between the region centroids, computed with the average covariance matrix. Note that if the regions have the same centroid, the second term vanishes, and thus the first term alone measures how similar the two 3-D ellipsoids are (this is the case of regions with similar colors but different texture).

We finally note that the element distance \(\delta\) defined in Eq. 6 is a metric, since the Bhattacharyya metric coefficient (Eq. 7) is a metric [25] and the square root of a (weighted) sum of squared metrics is also a metric.
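The distance of Eqs. 6–8 can be sketched in a few lines of plain Python. This is a minimal illustration, not the code of the Windsurf library: the function names and the representation of a region as a list of four `(centroid, covariance)` pairs, one per sub-band, are assumptions, and the covariance matrices are assumed to be non-singular.

```python
import math


def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n, result = len(a), 1.0
    for i in range(n):
        pivot = max(range(i, n), key=lambda r: abs(a[r][i]))
        if a[pivot][i] == 0:
            return 0.0
        if pivot != i:
            a[i], a[pivot] = a[pivot], a[i]
            result = -result  # a row swap flips the sign
        result *= a[i][i]
        for r in range(i + 1, n):
            f = a[r][i] / a[i][i]
            for c in range(i, n):
                a[r][c] -= f * a[i][c]
    return result


def solve(m, v):
    """Solve m x = v by Gauss-Jordan elimination (m assumed non-singular)."""
    n = len(m)
    a = [row[:] + [v[i]] for i, row in enumerate(m)]
    for i in range(n):
        pivot = max(range(i, n), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(n):
            if r != i:
                f = a[r][i] / a[i][i]
                for c in range(i, n + 1):
                    a[r][c] -= f * a[i][c]
    return [a[i][n] / a[i][i] for i in range(n)]


def bhattacharyya(mu_i, cov_i, mu_j, cov_j):
    """Bhattacharyya distance rho_B between two regions on one band (Eq. 8)."""
    n = len(mu_i)
    avg = [[(cov_i[r][c] + cov_j[r][c]) / 2 for c in range(n)] for r in range(n)]
    diff = [mu_i[k] - mu_j[k] for k in range(n)]
    cov_term = 0.5 * math.log(det(avg) / math.sqrt(det(cov_i) * det(cov_j)))
    x = solve(avg, diff)  # avg^{-1} (mu_i - mu_j) without an explicit inverse
    mean_term = sum(diff[k] * x[k] for k in range(n)) / 8
    return cov_term + mean_term


def region_distance(region_i, region_j):
    """Element distance delta (Eq. 6) over the four sub-bands."""
    total = 0.0
    for (mu_i, cov_i), (mu_j, cov_j) in zip(region_i, region_j):
        rho = bhattacharyya(mu_i, cov_i, mu_j, cov_j)
        # Squared per-band distance (Eq. 7); clamp tiny negative rounding errors.
        total += max(0.0, 1.0 - math.exp(-rho))
    return math.sqrt(total / 4)
```

For instance, two regions with identity covariance matrices on every band and centroids that differ by \((2,0,0)\) have \(\rho_B = 0.5\) on each band, so that \(\delta = \sqrt{1 - e^{-0.5}}\).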

Appendix B: Distance distribution of the Hausdorff metric

This appendix shows how it is possible to compute the distance distribution for the Hausdorff metric \({d}_H\), given the distance distribution for the ground distance \(\delta\) (see Eq. 1).

Formally, given a distance function \(\delta\) over a domain \(\Omega\), its distance distribution F(x) equals the probability that the distance between two random points \(\mathbf {p}\) and \(\mathbf {q}\) drawn from \(\Omega\) is not higher than x, \(F(x) = \Pr \{ \delta (\mathbf {p}, \mathbf {q}) \le x \}\) [16].

Given the ground distance \(\delta\) with its distance distribution F(x), we now want to compute the distance distribution G(x) of \({d}_H\), supposing that the two sets, \(\mathbf {S}_1\) and \(\mathbf {S}_2\), contain random points drawn from \(\Omega\) with the same distribution (so that the distance between any point of \(\mathbf {S}_1\) and any point of \(\mathbf {S}_2\) is distributed according to F(x)).

We first consider the general case in which the two sets may have different cardinalities, \(N_1\) and \(N_2\). Given any point \(\mathbf {p}_i\) in \(\mathbf {S}_1\), the distribution of the distance to its nearest neighbor (NN) in \(\mathbf {S}_2\) can be computed as follows [16]:

$$\begin{aligned} \Pr \{ \min _j \delta (\mathbf {p}_i, \mathbf {q}_j) \le x \} = 1 - \left( 1 - F(x) \right) ^{N_2}. \end{aligned}$$

The (asymmetric) distance from \(\mathbf {S}_1\) to \(\mathbf {S}_2\) is computed as the maximum of NN distances for all points in \(\mathbf {S}_1\), whose distribution is:

$$\begin{aligned} \Pr \{ \max _i \min _j \delta (\mathbf {p}_i, \mathbf {q}_j) \le x \} = \left( 1 - \left( 1 - F(x) \right) ^{N_2} \right) ^{N_1}. \end{aligned}$$

The same can be computed for the opposite direction (from \(\mathbf {S}_2\) to \(\mathbf {S}_1\)), obtaining:

$$\Pr \{ {d}_H (\mathbf {S}_1, \mathbf {S}_2) \le x \} = \min\left\{ \left( 1 - \left( 1 - F(x) \right) ^{N_2} \right) ^{N_1}, \left( 1 - \left( 1 - F(x) \right) ^{N_1} \right) ^{N_2} \right\}$$

which, for the case \(N_1=N_2\), becomes:

$$\begin{aligned} \Pr \{ {d}_H (\mathbf {S}_1, \mathbf {S}_2) \le x \} = \left( 1 - \left( 1 - F(x) \right) ^N \right) ^N. \end{aligned}$$
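The formulas of this appendix translate directly into code. The sketch below evaluates them for a given value of \(F(x)\); the function names are illustrative and do not come from the Windsurf library, and the caller is assumed to supply \(F(x)\) from an estimate of the ground distance distribution.

```python
def nn_distribution(f_x, n_target):
    """Pr{NN distance <= x} for one point against n_target random points."""
    return 1.0 - (1.0 - f_x) ** n_target


def directed_distribution(f_x, n_source, n_target):
    """Pr{max over the source points of their NN distance <= x}."""
    return nn_distribution(f_x, n_target) ** n_source


def hausdorff_distribution(f_x, n1, n2):
    """G(x) as given in this appendix: the smaller of the two directed terms."""
    return min(directed_distribution(f_x, n1, n2),
               directed_distribution(f_x, n2, n1))


if __name__ == "__main__":
    # Hypothetical values: F(x) = 0.1 and sets of 10 and 20 points.
    print(hausdorff_distribution(0.1, 10, 20))
```

When `n1 == n2`, both directed terms coincide and the function reduces to the last formula above.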

Cite this article

Bartolini, I., Patella, M. Windsurf: the best way to SURF. Multimedia Systems 24, 459–476 (2018). https://doi.org/10.1007/s00530-017-0567-4

  • DOI: https://doi.org/10.1007/s00530-017-0567-4
