Abstract
The concept of a metric hull was recently introduced to enclose a set of objects using a few selected border objects. Following one of the metric-hull computation methods that generate a hierarchy of metric hulls, we introduce the Metric Hull Tree (MH-tree), a metric index structure for unstructured and complex data. We propose constructing the MH-tree by a bulk-loading procedure and outline an insert operation. Based on the design of the tree, we provide an implementation of an approximate kNN search operation. Finally, we use the Profimedia dataset to evaluate various building and ranking strategies of the MH-tree and compare the results with the M-tree.
The publication of this paper and the follow-up research were supported by the ERDF “CyberSecurity, CyberCrime and Critical Information Infrastructures Center of Excellence” (No. CZ.02.1.01/0.0/0.0/16_019/0000822).
Notes
1. Or even distance spaces where no implicit coordinate system is defined.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Jánošová, M., Procházka, D., Dohnal, V. (2021). Organizing Similarity Spaces Using Metric Hulls. In: Reyes, N., et al. Similarity Search and Applications. SISAP 2021. Lecture Notes in Computer Science, vol 13058. Springer, Cham. https://doi.org/10.1007/978-3-030-89657-7_1
DOI: https://doi.org/10.1007/978-3-030-89657-7_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-89656-0
Online ISBN: 978-3-030-89657-7
eBook Packages: Computer Science, Computer Science (R0)