Querying Metric Spaces with Bit Operations

  • Richard Connor
  • Alan Dearle
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11223)

Abstract

Metric search techniques can be usefully characterised by the time at which distance calculations are performed during a query. Most exact search mechanisms use a “just-in-time” approach where distances are calculated as part of a navigational strategy. An alternative is to use a “one-time” approach, where distances to a fixed set of reference objects are calculated at the start of each query. These distances are typically used to re-cast data and queries into a different space where querying is more efficient, allowing an approximate solution to be obtained.

In this paper we use a “one-time” approach for an exact search mechanism. A fixed set of reference objects is used to define a large set of regions within the original space, and each query is assessed with respect to the definition of these regions. Data is then accessed if, and only if, it is useful for the calculation of the query solution.

As dimensionality increases, the number of defined regions must increase, but the memory required for the exclusion calculation does not. We show that the technique gives excellent performance over the SISAP benchmark data sets, and most interestingly we show how increases in dimensionality may be countered by relatively modest increases in the number of reference objects used.
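
The mechanism outlined in the abstract can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: regions are balls around each reference object, each region's membership over the data is held in one Python integer used as a bitset, and the names build_index and range_query are hypothetical. Distances from the query to the reference objects are computed once, the triangle inequality decides per region whether every solution must lie inside or outside it, and bitwise operations discard the data that cannot qualify. Only the surviving candidates are compared with the query in the original space, so the result is exact.

```python
import math


def euclidean(a, b):
    """Euclidean distance; any proper metric could be substituted."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_index(data, refs, radii, dist):
    """Return one bitset (a Python int) per ball region.

    Assumption for this sketch: region j is the ball of radius radii[j]
    around refs[j]. Bit i of column j is set when data[i] lies inside it.
    """
    columns = []
    for p, r in zip(refs, radii):
        bits = 0
        for i, x in enumerate(data):
            if dist(x, p) <= r:
                bits |= 1 << i
        columns.append(bits)
    return columns


def range_query(q, t, data, refs, radii, columns, dist):
    """Exact range search: indices i with dist(q, data[i]) <= t."""
    candidates = (1 << len(data)) - 1  # start with every datum possible
    for p, r, col in zip(refs, radii, columns):
        d = dist(q, p)  # "one-time" distance to a reference object
        if d > r + t:
            # Query ball lies wholly outside the region, so by the
            # triangle inequality no solution can be inside it.
            candidates &= ~col
        elif d < r - t:
            # Query ball lies wholly inside the region, so every
            # solution must be inside it.
            candidates &= col
        # Otherwise the query ball straddles the boundary: no exclusion.
    results = []
    i = 0
    while candidates:
        # Only surviving candidates are checked in the original space.
        if candidates & 1 and dist(q, data[i]) <= t:
            results.append(i)
        candidates >>= 1
        i += 1
    return results
```

In this sketch the exclusion step touches one bitset per region regardless of how many regions are defined, and the original data is read only for candidates that survive it.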

Notes

Acknowledgements

This work was supported by ESRC grant ES/L007487/1 “Administrative Data Research Centre—Scotland”. We would like to thank Tom Dalton for his help with preparation of the data and creating R scripts for rendering results, and Peter Christen along with the anonymous reviewers for helpful comments on earlier drafts.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Department of Computing Science, University of Stirling, Stirling, Scotland
  2. School of Computer Science, University of St Andrews, St Andrews, Scotland
