Improving Metric Access Methods with Bucket Files

  • Conference paper
Similarity Search and Applications (SISAP 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9371)

Abstract

Modern applications deal with complex data, and retrieval by similarity plays an important role in most of them. Complex data whose primary comparison mechanism is a similarity predicate are usually embedded in metric spaces. Metric Access Methods (MAMs) exploit the properties of metric spaces to divide the space into regions, using a divide-and-conquer strategy to process similarity queries efficiently, such as range and k-nearest neighbor queries.
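To make the region-based pruning concrete, here is a minimal sketch of a range query over a flat set of ball regions. It is not the structure proposed in the paper: the `Ball` class, the `range_query` function, and the choice of Euclidean distance are assumptions made for illustration. The only metric property used is the triangle inequality, which guarantees that a ball with center c and covering radius R cannot contain an answer when d(q, c) > R + r.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance, used here as an example metric."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


@dataclass
class Ball:
    """Hypothetical region: a center, a covering radius, and its members."""
    center: Point
    radius: float
    members: List[Point] = field(default_factory=list)


def range_query(
    balls: List[Ball],
    q: Point,
    r: float,
    dist: Callable[[Point, Point], float] = euclidean,
) -> Tuple[List[Point], int]:
    """Return objects within distance r of q and the number of distance calls."""
    results: List[Point] = []
    computed = 0
    for ball in balls:
        d_qc = dist(q, ball.center)
        computed += 1
        # Triangle inequality: if d(q, c) > R + r, the query ball and this
        # region are disjoint, so every member is skipped without comparison.
        if d_qc > ball.radius + r:
            continue
        for obj in ball.members:
            computed += 1
            if dist(q, obj) <= r:
                results.append(obj)
    return results, computed
```

With two well-separated balls, a query near the first ball costs one distance computation for the second ball instead of one per member, which is the kind of saving a MAM aims for.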

Existing MAMs use homogeneous data structures to improve query execution, following the same techniques employed by traditional methods developed to retrieve scalar and multidimensional data. In this paper, we combine hashing and hierarchical ball-partitioning approaches into a hybrid index tuned to improve similarity queries over complex data sets, with search algorithms that reduce total execution time by aggressively reducing the number of distance calculations. We applied our technique to the Slim-tree and performed experiments over real data sets, showing that it reduces the execution time of both range and k-nearest neighbor queries to at most half that of the Slim-tree. Moreover, the technique is general enough to be applied to many existing MAMs.
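The abstract does not describe the internal layout of the bucket files, so the following is only a generic illustration of how hashing on distances can cut distance calculations, not the authors' design. It assumes a single pivot and hashes each object to a bucket by its quantized distance to that pivot; the `RingBuckets` class, the `width` parameter, and the Euclidean metric are all hypothetical. Stored pivot distances let a range query discard objects using |d(q, p) - d(o, p)| > r before computing d(q, o).

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence, Tuple

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance, used here as an example metric."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


class RingBuckets:
    """Hypothetical bucket layout: hash key = floor(d(obj, pivot) / width)."""

    def __init__(self, pivot: Point, width: float,
                 dist: Callable[[Point, Point], float] = euclidean) -> None:
        self.pivot = pivot
        self.width = width
        self.dist = dist
        # Each bucket stores (object, precomputed distance to the pivot).
        self.buckets: Dict[int, List[Tuple[Point, float]]] = defaultdict(list)

    def insert(self, obj: Point) -> None:
        d = self.dist(obj, self.pivot)
        self.buckets[int(d // self.width)].append((obj, d))

    def range_query(self, q: Point, r: float) -> List[Point]:
        d_qp = self.dist(q, self.pivot)
        # Only rings intersecting [d(q,p) - r, d(q,p) + r] can hold answers.
        lo = int(max(0.0, d_qp - r) // self.width)
        hi = int((d_qp + r) // self.width)
        out: List[Point] = []
        for key in range(lo, hi + 1):
            for obj, d_op in self.buckets.get(key, ()):
                # Triangle-inequality filter using the stored pivot distance;
                # it avoids the actual distance computation when it fails.
                if abs(d_qp - d_op) > r:
                    continue
                if self.dist(q, obj) <= r:
                    out.append(obj)
        return out
```

A hybrid index in the spirit of the abstract would attach buckets like these to the nodes of a hierarchical ball partition; how the paper actually does this is not stated in the text above.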

Author information

Corresponding authors

Correspondence to Ives R. V. Pola or Daniel S. Kaster.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Pola, I.R.V., Traina, A.J.M., Traina, C., Kaster, D.S. (2015). Improving Metric Access Methods with Bucket Files. In: Amato, G., Connor, R., Falchi, F., Gennaro, C. (eds) Similarity Search and Applications. SISAP 2015. Lecture Notes in Computer Science, vol 9371. Springer, Cham. https://doi.org/10.1007/978-3-319-25087-8_6

  • DOI: https://doi.org/10.1007/978-3-319-25087-8_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25086-1

  • Online ISBN: 978-3-319-25087-8

  • eBook Packages: Computer Science, Computer Science (R0)