Abstract
Similarity queries are fundamental operations for applications that deal with complex data. This work proposes a new approach, called MIA (Metric Indexing Assisted by auxiliary memory with limited capacity), that uses a short-term memory to assist the construction of dynamic metric access methods, such as M-trees and Slim-trees. We propose three strategies, which were evaluated with various datasets and different node split policies. Experimental results show that metric access methods built with the MIA approach present a better distribution of the elements in the index nodes than access methods built without it. Moreover, the results show that the strategies decrease the overlap, the number of distance calculations, the number of disk accesses and the execution time of k-nearest neighbor queries.
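To make the evaluated quantities concrete, the following is a minimal sketch of a k-nearest neighbor query answered by a plain sequential scan, with a wrapper that counts distance calculations. It is not the MIA approach, an M-tree or a Slim-tree; it only illustrates the baseline cost that a metric access method tries to reduce. The names `CountingMetric`, `euclidean` and `knn_linear_scan`, the Euclidean distance and the sample points are all assumptions made for illustration.

```python
import heapq
import math


class CountingMetric:
    """Wraps a distance function and counts how often it is evaluated."""

    def __init__(self, dist):
        self.dist = dist
        self.calls = 0

    def __call__(self, a, b):
        self.calls += 1
        return self.dist(a, b)


def euclidean(a, b):
    """Euclidean distance, used here as an example metric (an assumption)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_linear_scan(data, query, k, metric):
    """Return the k (distance, element) pairs closest to query, nearest first.

    A sequential scan evaluates the metric once per stored element; an index
    aims to answer the same query with fewer evaluations.
    """
    return heapq.nsmallest(k, ((metric(query, x), x) for x in data))


# Hypothetical toy dataset.
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (5.0, 5.0), (0.5, 0.0)]
metric = CountingMetric(euclidean)
neighbors = knn_linear_scan(points, (0.0, 0.0), 2, metric)
print(neighbors, metric.calls)
```

In this sketch the scan performs one distance calculation per stored element, so `metric.calls` equals the dataset size. A well-built index would be expected to prune part of the data and lower that count for the same query.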
This work has been supported by CNPq (Brazilian National Council for Supporting Research), by CAPES (Brazilian Coordination for Improvement of Higher Level Personnel) and by PROPP/UFU.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Razente, H., Sousa, R.M.S., Barioni, M.C.N. (2018). Metric Indexing Assisted by Short-Term Memories. In: Marchand-Maillet, S., Silva, Y., Chávez, E. (eds) Similarity Search and Applications. SISAP 2018. Lecture Notes in Computer Science, vol 11223. Springer, Cham. https://doi.org/10.1007/978-3-030-02224-2_9
DOI: https://doi.org/10.1007/978-3-030-02224-2_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-02223-5
Online ISBN: 978-3-030-02224-2
eBook Packages: Computer Science, Computer Science (R0)